Numerical Analysis
Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Numerical Analysis
NINTH EDITION
Richard L. Burden
Youngstown State University
J. Douglas Faires
Youngstown State University
Australia • Brazil • Japan • Korea • Mexico • Singapore • Spain • United Kingdom • United States
This is an electronic version of the print textbook. Due to electronic rights restrictions, some third party content may be suppressed.
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience.
The publisher reserves the right to remove content from this title at any time if subsequent rights restrictions require it.
For valuable information on pricing, previous editions, changes to current editions, and alternate formats,
please visit www.cengage.com/highered to search by ISBN#, author, title, or keyword for materials in your areas of interest.
Numerical Analysis, Ninth Edition
Richard L. Burden and J. Douglas Faires
Editor-in-Chief: Michelle Julet
Publisher: Richard Stratton
Senior Sponsoring Editor: Molly Taylor
Associate Editor: Daniel Seibert
Editorial Assistant: Shaylin Walsh
Associate Media Editor: Andrew Coppola
Senior Marketing Manager: Jennifer Pursley Jones
Marketing Coordinator: Erica O’Connell
Marketing Communications Manager: Mary Anne Payumo
Content Project Manager: Jill Clark
Art Director: Jill Ort
Senior Manufacturing Buyer: Diane Gibbons
Senior Rights Acquisition Specialist: Katie Huha
Production Service: Cadmus Communications
Text Designer: Jay Purcell
Cover Designer: Wing Ngan
Cover Image: Spiral Vortex
Photographer: Akira Inoue
Collection: Amana images, Gettyimages.com
Compositor: Cadmus Communications
© 2011, 2005, 2001 Brooks/Cole, Cengage Learning
ALL RIGHTS RESERVED. No part of this work covered by the copyright herein
may be reproduced, transmitted, stored, or used in any form or by any means
graphic, electronic, or mechanical, including but not limited to photocopying,
recording, scanning, digitizing, taping, Web distribution, information networks,
or information storage and retrieval systems, except as permitted under Section
107 or 108 of the 1976 United States Copyright Act, without the prior written
permission of the publisher.
For product information and technology assistance, contact us at:
Cengage Learning Customer & Sales Support,
1-800-354-9706
For permission to use material from this text or product,
submit all requests online at
www.cengage.com/permissions.
Further permissions questions can be emailed to
permissionrequest@cengage.com.
Library of Congress Control Number: 2010922639
ISBN-13: 978-0-538-73351-9
ISBN-10: 0-538-73351-9
Brooks/Cole
20 Channel Center Street
Boston, MA 02210
USA
Cengage Learning is a leading provider of customized learning solutions with
office locations around the globe, including Singapore, the United Kingdom,
Australia, Mexico, Brazil and Japan. Locate your local office at
international.cengage.com/region.
Cengage Learning products are represented in Canada by Nelson Education, Ltd.
For your course and learning solutions, visit
www.cengage.com.
Purchase any of our products at your local college store or at our preferred
online store www.cengagebrain.com.
Printed in Canada
1 2 3 4 5 6 7    14 13 12 11 10
Contents
Preface ix
1  Mathematical Preliminaries and Error Analysis 1
1.1 Review of Calculus 2
1.2 Round-off Errors and Computer Arithmetic 17
1.3 Algorithms and Convergence 32
1.4 Numerical Software 41
2  Solutions of Equations in One Variable 47
2.1 The Bisection Method 48
2.2 Fixed-Point Iteration 56
2.3 Newton’s Method and Its Extensions 67
2.4 Error Analysis for Iterative Methods 79
2.5 Accelerating Convergence 86
2.6 Zeros of Polynomials and Müller’s Method 91
2.7 Survey of Methods and Software 101
3  Interpolation and Polynomial Approximation 105
3.1 Interpolation and the Lagrange Polynomial 106
3.2 Data Approximation and Neville’s Method 117
3.3 Divided Differences 124
3.4 Hermite Interpolation 136
3.5 Cubic Spline Interpolation 144
3.6 Parametric Curves 164
3.7 Survey of Methods and Software 171
4  Numerical Differentiation and Integration 173
4.1 Numerical Differentiation 174
4.2 Richardson’s Extrapolation 185
4.3 Elements of Numerical Integration 193
4.4 Composite Numerical Integration 203
4.5 Romberg Integration 213
4.6 Adaptive Quadrature Methods 220
4.7 Gaussian Quadrature 228
4.8 Multiple Integrals 235
4.9 Improper Integrals 250
4.10 Survey of Methods and Software 256
5  Initial-Value Problems for Ordinary Differential Equations 259
5.1 The Elementary Theory of Initial-Value Problems 260
5.2 Euler’s Method 266
5.3 Higher-Order Taylor Methods 276
5.4 Runge-Kutta Methods 282
5.5 Error Control and the Runge-Kutta-Fehlberg Method 293
5.6 Multistep Methods 302
5.7 Variable Step-Size Multistep Methods 315
5.8 Extrapolation Methods 321
5.9 Higher-Order Equations and Systems of Differential Equations 328
5.10 Stability 339
5.11 Stiff Differential Equations 348
5.12 Survey of Methods and Software 355
6  Direct Methods for Solving Linear Systems 357
6.1 Linear Systems of Equations 358
6.2 Pivoting Strategies 372
6.3 Linear Algebra and Matrix Inversion 381
6.4 The Determinant of a Matrix 396
6.5 Matrix Factorization 400
6.6 Special Types of Matrices 411
6.7 Survey of Methods and Software 428
7  Iterative Techniques in Matrix Algebra 431
7.1 Norms of Vectors and Matrices 432
7.2 Eigenvalues and Eigenvectors 443
7.3 The Jacobi and Gauss-Seidel Iterative Techniques 450
7.4 Relaxation Techniques for Solving Linear Systems 462
7.5 Error Bounds and Iterative Refinement 469
7.6 The Conjugate Gradient Method 479
7.7 Survey of Methods and Software 495
8
ApproximationTheory 497
8.1 Discrete Least Squares Approximation 498
8.2 Orthogonal Polynomials and Least Squares Approximation 510
8.3 Chebyshev Polynomials and Economization of Power Series 518
8.4 Rational Function Approximation 528
8.5 Trigonometric Polynomial Approximation 538
8.6 Fast Fourier Transforms 547
8.7 Survey of Methods and Software 558
9
Approximating Eigenvalues 561
9.1 Linear Algebra and Eigenvalues 562
9.2 Orthogonal Matrices and Similarity Transformations 570
9.3 The Power Method 576
9.4 Householder’s Method 593
9.5 The QR Algorithm 601
9.6 Singular Value Decomposition 614
9.7 Survey of Methods and Software 626
10  Numerical Solutions of Nonlinear Systems of Equations 629
10.1 Fixed Points for Functions of Several Variables 630
10.2 Newton’s Method 638
10.3 Quasi-Newton Methods 647
10.4 Steepest Descent Techniques 654
10.5 Homotopy and Continuation Methods 660
10.6 Survey of Methods and Software 668
11  Boundary-Value Problems for Ordinary Differential Equations 671
11.1 The Linear Shooting Method 672
11.2 The Shooting Method for Nonlinear Problems 678
11.3 Finite-Difference Methods for Linear Problems 684
11.4 Finite-Difference Methods for Nonlinear Problems 691
11.5 The Rayleigh-Ritz Method 696
11.6 Survey of Methods and Software 711
12  Numerical Solutions to Partial Differential Equations 713
12.1 Elliptic Partial Differential Equations 716
12.2 Parabolic Partial Differential Equations 725
12.3 Hyperbolic Partial Differential Equations 739
12.4 An Introduction to the Finite-Element Method 746
12.5 Survey of Methods and Software 760
Bibliography 763
Answers to Selected Exercises 773
Index 863
Preface
About the Text
This book was written for a sequence of courses on the theory and application of numerical
approximation techniques. It is designed primarily for junior-level mathematics, science,
and engineering majors who have completed at least the standard college calculus sequence.
Familiarity with the fundamentals of linear algebra and differential equations is useful, but
there is sufficient introductory material on these topics so that courses in these subjects are
not needed as prerequisites.
Previous editions of Numerical Analysis have been used in a wide variety of situations.
In some cases, the mathematical analysis underlying the development of approximation
techniques was given more emphasis than the methods; in others, the emphasis was reversed. The book has been used as a core reference for beginning graduate level courses
in engineering and computer science programs and in first-year courses in introductory
analysis offered at international universities. We have adapted the book to fit these diverse
users without compromising our original purpose:
To introduce modern approximation techniques; to explain how, why, and when they
can be expected to work; and to provide a foundation for further study of numerical
analysis and scientific computing.
The book contains sufficient material for at least a full year of study, but we expect many
people to use it for only a single-term course. In such a single-term course, students learn
to identify the types of problems that require numerical techniques for their solution and
see examples of the error propagation that can occur when numerical methods are applied.
They accurately approximate the solution of problems that cannot be solved exactly and
learn typical techniques for estimating error bounds for the approximations. The remainder
of the text then serves as a reference for methods not considered in the course. Either the
full-year or single-course treatment is consistent with the philosophy of the text.
Virtually every concept in the text is illustrated by example, and this edition contains
more than 2600 class-tested exercises ranging from elementary applications of methods
and algorithms to generalizations and extensions of the theory. In addition, the exercise
sets include numerous applied problems from diverse areas of engineering as well as from
the physical, computer, biological, economic, and social sciences. The chosen applications
clearly and concisely demonstrate how numerical techniques can be, and often must be,
applied in real-life situations.
A number of software packages, known as Computer Algebra Systems (CAS), have
been developed to produce symbolic mathematical computations. Maple®, Mathematica®, and MATLAB® are predominant among these in the academic environment, and versions
of these software packages are available for most common computer systems. In addition,
Sage, a free open source system, is now available. This system was developed primarily
ix
by William Stein at the University of Washington, and was first released in February 2005.
Information about Sage can be found at the site
http://www.sagemath.org .
Although there are differences among the packages, both in performance and price, all can
perform standard algebra and calculus operations.
The results in most of our examples and exercises have been generated using problems
for which exact solutions are known, because this permits the performance of the approximation method to be more easily monitored. For many numerical techniques the error
analysis requires bounding a higher ordinary or partial derivative, which can be a tedious
task and one that is not particularly instructive once the techniques of calculus have been
mastered. Having a symbolic computation package available can be very useful in the study
of approximation techniques, because exact values for derivatives can easily be obtained. A
little insight often permits a symbolic computation to aid in the bounding process as well.
We have chosen Maple as our standard package because of its wide academic distribution and because it now has a NumericalAnalysis package that contains programs that
parallel the methods and algorithms in our text. However, other CAS can be substituted with
only minor modifications. Examples and exercises have been added whenever we felt that
a CAS would be of significant benefit, and we have discussed the approximation methods
that CAS employ when they are unable to solve a problem exactly.
Algorithms and Programs
In our first edition we introduced a feature that at the time was innovative and somewhat
controversial. Instead of presenting our approximation techniques in a specific programming
language (FORTRAN was dominant at the time), we gave algorithms in a pseudo code that
would lead to a well-structured program in a variety of languages. The programs are coded
and available online in most common programming languages and CAS worksheet formats.
All of these are on the web site for the book:
http://www.math.ysu.edu/faires/Numerical-Analysis/ .
For each algorithm there is a program written in FORTRAN, Pascal, C, and Java. In addition,
we have coded the programs using Maple, Mathematica, and MATLAB. This should ensure
that a set of programs is available for most common computing systems.
Every program is illustrated with a sample problem that is closely correlated to the text.
This permits the program to be run initially in the language of your choice to see the form
of the input and output. The programs can then be modified for other problems by making
minor changes. The form of the input and output is, as nearly as possible, the same in
each of the programming systems. This permits an instructor using the programs to discuss
them generically, without regard to the particular programming system an individual student
chooses to use.
The programs are designed to run on a minimally configured computer and are given in
ASCII format for flexibility of use. This permits them to be altered using any editor or word
processor that creates standard ASCII files (commonly called “Text Only” files). Extensive
README files are included with the program files so that the peculiarities of the various
programming systems can be individually addressed. The README files are presented
both in ASCII format and as PDF files. As new software is developed, the programs will
be updated and placed on the web site for the book.
For most of the programming systems the appropriate software is needed, such as a
compiler for Pascal, FORTRAN, and C, or one of the computer algebra systems (Maple,
Mathematica, and MATLAB). The Java implementations are an exception. You need the
system to run the programs, but Java can be freely downloaded from various sites. The best
way to obtain Java is to use a search engine to search on the name, choose a download site,
and follow the instructions for that site.
New for This Edition
The first edition of this book was published more than 30 years ago, in the decade after major
advances in numerical techniques were made to reflect the new widespread availability of
computer equipment. In our revisions of the book we have added new techniques in order
to keep our treatment current. To continue this trend, we have made a number of significant
changes to the ninth edition.
Our treatment of Numerical Linear Algebra has been extensively expanded and constitutes one of the major changes in this edition. In particular, a section on Singular Value
Decomposition has been added at the end of Chapter 9. This required a complete rewrite
of the early part of Chapter 9 and considerable expansion of Chapter 6 to include necessary material concerning symmetric and orthogonal matrices. Chapter 9 is approximately
40% longer than in the eighth edition, and contains a significant number of new examples
and exercises. Although students would certainly benefit from a course in Linear Algebra
before studying this material, sufficient background material is included in the book, and
every result whose proof is not given is referenced to at least one commonly available
source.
All the Examples in the book have been rewritten to better emphasize the problem to
be solved before the specific solution is presented. Additional steps have been added to
many of the examples to explicitly show the computations required for the first steps of
iteration processes. This gives the reader a way to test and debug programs they have
written for problems similar to the examples.
A new item designated as an Illustration has been added. This is used when discussing a
specific application of a method not suitable for the problem statement-solution format
of the Examples.
The Maple code we include now follows, whenever possible, the material included in
their NumericalAnalysis package. The statements given in the text are precisely what is
needed for the Maple worksheet applications, and the output is given in the same font
and color format that Maple produces.
A number of sections have been expanded, and some divided, to make it easier for instructors to assign problems immediately after the material is presented. This is particularly
true in Chapters 3, 6, 7, and 9.
Numerous new historical notes have been added, primarily in the margins where they
can be considered independent of the text material. Much of the current material used in
Numerical Analysis was developed in the middle of the 20th century, and students should be
aware that mathematical discoveries are ongoing.
The bibliographic material has been updated to reflect new editions of books that we
reference. New sources have been added that were not previously available.
As always with our revisions, every sentence was examined to determine if it was phrased
in a manner that best relates what is described.
Supplements
A Student Solutions Manual and Study Guide (ISBN-10: 0-538-73351-9; ISBN-13: 978-0-
538-73351-9) is available for purchase with this edition, and contains worked-out solutions
to many of the problems. The solved exercises cover all of the techniques discussed in the
text, and include step-by-step instructions for working through the algorithms. The first two
chapters of this Guide are available for preview on the web site for the book.
Complete solutions to all exercises in the text are available to instructors in a secure, customizable online format through the Cengage Solution Builder service. Adopting instructors can sign up for access at www.cengage.com/solutionbuilder. Computation results
in these solutions were regenerated for this edition using the programs on the web site to
ensure compatibility among the various programming systems.
A set of classroom lecture slides, prepared by Professor John Carroll of Dublin City University, is available on the book’s instructor companion web site at www.cengage.com/math/burden. These slides, created using the Beamer package of LaTeX, are in PDF
format. They present examples, hints, and step-by-step animations of important techniques
in Numerical Analysis.
Possible Course Suggestions
Numerical Analysis is designed to give instructors flexibility in the choice of topics as well
as in the level of theoretical rigor and in the emphasis on applications. In line with these
aims, we provide detailed references for results not demonstrated in the text and for the
applications used to indicate the practical importance of the methods. The text references
cited are those most likely to be available in college libraries, and they have been updated to
reflect recent editions. We also include quotations from original research papers when we
feel this material is accessible to our intended audience. All referenced material has been
indexed to the appropriate locations in the text, and Library of Congress information for
reference material has been included to permit easy location if searching for library material.
The following flowchart indicates chapter prerequisites. Most of the possible sequences
that can be generated from this chart have been taught by the authors at Youngstown State
University.
[Flowchart omitted: diagram of chapter prerequisites among Chapters 1 through 12, with Chapter 1 as the common prerequisite.]
The additional material in this edition should permit instructors to prepare an undergraduate course in Numerical Linear Algebra for students who have not previously studied
Numerical Analysis. This could be done by covering Chapters 1, 6, 7, and 9, and then, as
time permits, including other material of the instructor’s choice.
Acknowledgments
We have been fortunate to have had many of our students and colleagues give us their
impressions of earlier editions of this book. We have tried to include all the suggestions
that complement the philosophy of the book, and we are extremely grateful to all those who
have taken the time to contact us about ways to improve subsequent versions.
We would particularly like to thank the following, whose suggestions we have used in
this and previous editions.
John Carroll, Dublin City University (Ireland)
Gustav Delius, University of York (UK)
Pedro José Paúl Escolano, University of Sevilla (Spain)
Warren Hickman, Westminster College
Jozsi Jalics, Youngstown State University
Dan Kalman, American University
Robert Lantos, University of Ottawa (Canada)
Eric Rawdon, Duquesne University
Phillip Schmidt, University of Northern Kentucky
Kathleen Shannon, Salisbury University
Roy Simpson, State University of New York, Stony Brook
Dennis C. Smolarski, Santa Clara University
Richard Varga, Kent State University
James Verner, Simon Fraser University (Canada)
André Weideman, University of Stellenbosch (South Africa)
Joan Weiss, Fairfield University
Nathaniel Whitaker, University of Massachusetts at Amherst
Dick Wood, Seattle Pacific University
George Yates, Youngstown State University
As has been our practice in past editions of the book, we used undergraduate student
help at Youngstown State University in preparing the ninth edition. Our assistant for this
edition was Mario Sracic, who checked the new Maple code in the book and worked as our
in-house copy editor. In addition, Edward Burden has been checking all the programs that
accompany the text. We would like to express gratitude to our colleagues on the faculty and
administration of Youngstown State University for providing us the opportunity, facilities,
and encouragement to complete this project.
We would also like to thank some people who have made significant contributions
to the history of numerical methods. Herman H. Goldstine has written an excellent book
entitled A History of Numerical Analysis from the 16th Through the 19th Century [Golds].
In addition, The Words of Mathematics [Schw], by Steven Schwartzman, has been a help
in compiling our historical material. Another source of excellent historical mathematical
knowledge is the MacTutor History of Mathematics archive at the University of St. Andrews
in Scotland. It has been created by John J. O’Connor and Edmund F. Robertson and has the
internet address
http://www-gap.dcs.st-and.ac.uk/history/ .
An incredible amount of work has gone into creating the material on this site, and we have
found the information to be unfailingly accurate. Finally, thanks to all the contributors to
Wikipedia who have added their expertise to that site so that others can benefit from their
knowledge.
In closing, thanks again to those who have spent the time and effort to contact us
over the years. It has been wonderful to hear from so many students and faculty who used
our book for their first exposure to the study of numerical methods. We hope this edition
continues this exchange, and adds to the enjoyment of students studying numerical analysis.
If you have any suggestions for improving future editions of the book, we would, as always,
be grateful for your comments. We can be contacted most easily by electronic mail at the
addresses listed below.
Richard L. Burden
burden@math.ysu.edu
J. Douglas Faires
faires@math.ysu.edu
CHAPTER 1
Mathematical Preliminaries and Error Analysis
Introduction
In beginning chemistry courses, we see the ideal gas law,
PV = NRT,
which relates the pressure P, volume V, temperature T, and number of moles N of an
“ideal” gas. In this equation, R is a constant that depends on the measurement system.
Suppose two experiments are conducted to test this law, using the same gas in each
case. In the first experiment,
P = 1.00 atm,  V = 0.100 m³,  N = 0.00420 mol,  R = 0.08206.
The ideal gas law predicts the temperature of the gas to be

T = PV/(NR) = (1.00)(0.100)/((0.00420)(0.08206)) = 290.15 K = 17°C.
When we measure the temperature of the gas, however, we find that the true temperature is 15°C.
[Margin figure: two gas containers with volumes V₁ and V₂.]
We then repeat the experiment using the same values of R and N, but increase the
pressure by a factor of two and reduce the volume by the same factor. The product PV
remains the same, so the predicted temperature is still 17°C. But now we find that the actual
temperature of the gas is 19°C.
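The two predictions are easy to reproduce with a few lines of Python (a sketch of our own, independent of the Maple sessions used later in the chapter):

```python
# Predicted temperatures from the ideal gas law, T = PV/(NR).
R = 0.08206          # gas constant for these units
N = 0.00420          # number of moles

# First experiment: P = 1.00 atm, V = 0.100 m^3.
T1 = (1.00 * 0.100) / (N * R)     # kelvin
# Second experiment: double the pressure, halve the volume.
T2 = (2.00 * 0.050) / (N * R)     # kelvin

print(round(T1, 2), round(T1 - 273.15))   # 290.15 K, i.e., 17 C
print(round(T2, 2), round(T2 - 273.15))   # the product PV is unchanged, so 17 C again
```

Since the product PV is identical in both experiments, the law necessarily predicts the same temperature for both, which is exactly the discrepancy the chapter asks us to examine.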
Clearly, the ideal gas law is suspect, but before concluding that the law is invalid in
this situation, we should examine the data to see whether the error could be attributed to
the experimental results. If so, we might be able to determine how much more accurate
our experimental results would need to be to ensure that an error of this magnitude did not
occur.
Analysis of the error involved in calculations is an important topic in numerical analysis
and is introduced in Section 1.2. This particular application is considered in Exercise 28 of
that section.
This chapter contains a short review of those topics from single-variable calculus that
will be needed in later chapters. A solid knowledge of calculus is essential for an understanding
of the analysis of numerical techniques, and a more thorough review might be needed if
you have been away from this subject for a while. In addition, there is an introduction to
convergence, error analysis, the machine representation of numbers, and some techniques
for categorizing and minimizing computational error.
1.1 Review of Calculus
Limits and Continuity
The concepts of limit and continuity of a function are fundamental to the study of calculus,
and form the basis for the analysis of numerical techniques.
Definition 1.1  A function f defined on a set X of real numbers has the limit L at x₀, written

    lim_{x→x₀} f(x) = L,

if, given any real number ε > 0, there exists a real number δ > 0 such that

    |f(x) − L| < ε,  whenever x ∈ X and 0 < |x − x₀| < δ.

(See Figure 1.1.)
Figure 1.1  [Graph of y = f(x): for x within δ of x₀, the values f(x) lie between L − ε and L + ε.]
Definition 1.2  Let f be a function defined on a set X of real numbers and x₀ ∈ X. Then f is continuous
at x₀ if

    lim_{x→x₀} f(x) = f(x₀).

The function f is continuous on the set X if it is continuous at each number in X.

The set of all functions that are continuous on the set X is denoted C(X). When X is
an interval of the real line, the parentheses in this notation are omitted. For example, the
set of all functions continuous on the closed interval [a, b] is denoted C[a, b]. The symbol
ℝ denotes the set of all real numbers, which also has the interval notation (−∞, ∞). So
the set of all functions that are continuous at every real number is denoted by C(ℝ) or by
C(−∞, ∞).
The basic concepts of calculus
and its applications were
developed in the late 17th and
early 18th centuries, but the
mathematically precise concepts
of limits and continuity were not
described until the time of
Augustin Louis Cauchy
(1789–1857), Heinrich Eduard
Heine (1821–1881), and Karl
Weierstrass (1815–1897) in the
latter portion of the 19th century.
The limit of a sequence of real or complex numbers is defined in a similar manner.
Definition 1.3  Let {xₙ}ₙ₌₁^∞ be an infinite sequence of real numbers. This sequence has the limit x (converges
to x) if, for any ε > 0, there exists a positive integer N(ε) such that |xₙ − x| < ε whenever
n > N(ε). The notation

    lim_{n→∞} xₙ = x,  or  xₙ → x as n → ∞,

means that the sequence {xₙ}ₙ₌₁^∞ converges to x.
Theorem 1.4  If f is a function defined on a set X of real numbers and x₀ ∈ X, then the following
statements are equivalent:

a.  f is continuous at x₀;
b.  if {xₙ}ₙ₌₁^∞ is any sequence in X converging to x₀, then lim_{n→∞} f(xₙ) = f(x₀).
The functions we will consider when discussing numerical methods will be assumed
to be continuous because this is a minimal requirement for predictable behavior. Functions
that are not continuous can skip over points of interest, which can cause difficulties when
attempting to approximate a solution to a problem.
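Part (b) of Theorem 1.4 can be watched numerically. In the small Python sketch below (our own illustration, not part of the text), the sequence xₙ = 2 + 1/n converges to x₀ = 2, and the values f(xₙ) of the continuous function f(x) = sin x close in on f(x₀):

```python
import math

f = math.sin      # a function continuous at every real number
x0 = 2.0

# x_n = 2 + 1/n converges to x0 = 2 as n grows, so f(x_n) must approach f(x0).
for n in (10, 100, 1000, 10000):
    xn = x0 + 1.0 / n
    print(n, abs(f(xn) - f(x0)))   # the gap shrinks toward 0
```

For a discontinuous function this conclusion can fail, which is one reason continuity is the minimal assumption we make.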
Differentiability
More sophisticated assumptions about a function generally lead to better approximation
results. For example, a function with a smooth graph will normally behave more predictably
than one with numerous jagged features. The smoothness condition relies on the concept
of the derivative.
Definition 1.5  Let f be a function defined in an open interval containing x₀. The function f is differentiable
at x₀ if

    f′(x₀) = lim_{x→x₀} (f(x) − f(x₀))/(x − x₀)

exists. The number f′(x₀) is called the derivative of f at x₀. A function that has a derivative
at each number in a set X is differentiable on X.
The derivative of f at x₀ is the slope of the tangent line to the graph of f at (x₀, f(x₀)),
as shown in Figure 1.2.
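The limit in Definition 1.5 can be explored by tabulating difference quotients. Here is a short Python sketch of our own, using f(x) = x² at x₀ = 1, where the derivative is f′(1) = 2:

```python
def f(x):
    return x * x

x0 = 1.0
# Difference quotients (f(x0 + h) - f(x0))/h for h shrinking toward 0.
for h in (0.1, 0.01, 0.001, 0.0001):
    quotient = (f(x0 + h) - f(x0)) / h
    print(h, quotient)   # approaches f'(1) = 2
```

For this particular f the quotient works out to exactly 2 + h, so the convergence to the derivative is plainly visible in the printed table.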
Figure 1.2  [Graph of y = f(x): the tangent line at the point (x₀, f(x₀)) has slope f′(x₀).]
Theorem 1.6  If the function f is differentiable at x₀, then f is continuous at x₀.
The next theorems are of fundamental importance in deriving methods for error
estimation. The proofs of these theorems and the other unreferenced results in this section can
be found in any standard calculus text.
The theorem attributed to Michel
Rolle (1652–1719) appeared in
1691 in a little-known treatise
entitled Méthode pour résoudre
les égalités. Rolle originally
criticized the calculus that was
developed by Isaac Newton and
Gottfried Leibniz, but later
became one of its proponents.
The set of all functions that have n continuous derivatives on X is denoted Cⁿ(X), and
the set of functions that have derivatives of all orders on X is denoted C^∞(X). Polynomial,
rational, trigonometric, exponential, and logarithmic functions are in C^∞(X), where X
consists of all numbers for which the functions are defined. When X is an interval of the
real line, we will again omit the parentheses in this notation.
Theorem 1.7  (Rolle’s Theorem)
Suppose f ∈ C[a, b] and f is differentiable on (a, b). If f(a) = f(b), then a number c in
(a, b) exists with f′(c) = 0. (See Figure 1.3.)
Figure 1.3  [Graph of y = f(x) with f(a) = f(b); at some c in (a, b) the tangent is horizontal, f′(c) = 0.]
Theorem 1.8  (Mean Value Theorem)
If f ∈ C[a, b] and f is differentiable on (a, b), then a number c in (a, b) exists with

    f′(c) = (f(b) − f(a))/(b − a).

(See Figure 1.4.)
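For a concrete instance of the Mean Value Theorem (an illustration of our own, not from the text): with f(x) = x³ on [0, 1] the secant slope is (f(1) − f(0))/(1 − 0) = 1, and f′(c) = 3c² equals that slope at c = 1/√3 ≈ 0.577, which lies inside (0, 1):

```python
import math

a, b = 0.0, 1.0
f = lambda x: x ** 3

secant_slope = (f(b) - f(a)) / (b - a)   # equals 1 here
c = math.sqrt(1.0 / 3.0)                 # solves f'(c) = 3c^2 = secant_slope
print(c, 3 * c ** 2)                     # c lies in (0, 1) and matches the slope
```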
Figure 1.4  [Parallel lines: the tangent line at c, with slope f′(c), is parallel to the secant line through (a, f(a)) and (b, f(b)), with slope (f(b) − f(a))/(b − a).]
Theorem 1.9  (Extreme Value Theorem)
If f ∈ C[a, b], then c₁, c₂ ∈ [a, b] exist with f(c₁) ≤ f(x) ≤ f(c₂), for all x ∈ [a, b].
In addition, if f is differentiable on (a, b), then the numbers c₁ and c₂ occur either at the
endpoints of [a, b] or where f′ is zero. (See Figure 1.5.)
Figure 1.5  [Graph of y = f(x) on [a, b], with the extreme values attained at the interior points c₂ and c₁.]
Research work on the design of
algorithms and systems for
performing symbolic
mathematics began in the 1960s.
The first system to be operational,
in the 1970s, was a LISP-based
system called MACSYMA.
As mentioned in the preface, we will use the computer algebra system Maple whenever
appropriate. Computer algebra systems are particularly useful for symbolic differentiation
and plotting graphs. Both techniques are illustrated in Example 1.
Example 1  Use Maple to find the absolute minimum and absolute maximum values of

    f(x) = 5 cos 2x − 2x sin 2x

on the intervals (a) [1, 2] and (b) [0.5, 1].
Solution  There is a choice of Text input or Math input under the Maple 2D Math option.
The Text input is used to document worksheets by adding standard text information in
the document. The Math input option is used to execute Maple commands. Maple input
can either be typed or selected from the palettes at the left of the Maple screen. We will
show the input as typed because it is easier to accurately describe the commands. For palette
input instructions you should consult the Maple tutorials. In our presentation, Maple input
commands appear in italic type, and Maple responses appear in cyan type.
To ensure that the variables we use have not been previously assigned, we first issue
the command
The Maple development project
began at the University of
Waterloo in late 1980. Its goal
was to be accessible to
researchers in mathematics,
engineering, and science, but
additionally to students for
educational purposes. To be
effective it needed to be portable,
as well as space and time
efficient. Demonstrations of the
system were presented in 1982,
and the major paper setting out
the design criteria for the
MAPLE system was presented in
1983 [CGGG].
restart

to clear the Maple memory. We first illustrate the graphing capabilities of Maple. To access
the graphing package, enter the command

    with(plots)

to load the plots subpackage. Maple responds with a list of available commands in the
package. This list can be suppressed by placing a colon after the with(plots) command.
The following command defines f(x) = 5 cos 2x − 2x sin 2x as a function of x:

    f := x → 5 cos(2x) − 2x · sin(2x)

and Maple responds with

    x → 5 cos(2x) − 2x sin(2x)

We can plot the graph of f on the interval [0.5, 2] with the command

    plot(f, 0.5 .. 2)

Figure 1.6 shows the screen that results from this command after doing a mouse click on
the graph. This click tells Maple to enter its graph mode, which presents options for various
views of the graph. We can determine the coordinates of a point of the graph by moving the
mouse cursor to the point. The coordinates appear in the box above the left of the plot(f,
0.5 .. 2) command. This feature is useful for estimating the axis intercepts and extrema of
functions.
The absolute maximum and minimum values of f(x) on the interval [a, b] can occur
only at the endpoints, or at a critical point.
(a) When the interval is [1, 2] we have

    f(1) = 5 cos 2 − 2 sin 2 = −3.899329036  and  f(2) = 5 cos 4 − 4 sin 4 = −0.241008123.

A critical point occurs when f′(x) = 0. To use Maple to find this point, we first define a
function fp to represent f′ with the command

    fp := x → diff(f(x), x)

and Maple responds with

    x → d/dx f(x)

To find the explicit representation of f′(x) we enter the command

    fp(x)

and Maple gives the derivative as

    −12 sin(2x) − 4x cos(2x)

To determine the critical point we use the command

    fsolve(fp(x), x, 1 .. 2)
Figure 1.6  [Maple screen showing the plot of f on [0.5, 2] in graph mode.]
and Maple tells us that f′(x) = fp(x) = 0 for x in [1, 2] when x is

    1.358229874

We evaluate f(x) at this point with the command

    f(%)

The % is interpreted as the last Maple response. The value of f at the critical point is

    −5.675301338

As a consequence, the absolute maximum value of f(x) in [1, 2] is f(2) = −0.241008123
and the absolute minimum value is f(1.358229874) = −5.675301338, accurate at least to
the places listed.
(b) When the interval is [0.5, 1] we have the values at the endpoints given by

    f(0.5) = 5 cos 1 − 1 sin 1 = 1.860040545  and  f(1) = 5 cos 2 − 2 sin 2 = −3.899329036.

However, when we attempt to determine the critical point in the interval [0.5, 1] with the
command

    fsolve(fp(x), x, 0.5 .. 1)
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
8 CHAPTER 1 Mathematical Preliminaries and Error Analysis
Maple gives the response

    fsolve(−12 sin(2x) − 4x cos(2x), x, .5 .. 1)

This indicates that Maple is unable to determine the solution. The reason is obvious once
the graph in Figure 1.6 is considered. The function f is always decreasing on this interval,
so no solution exists. Be suspicious when Maple returns the same response it is given; it is
as if it were questioning your request.
In summary, on [0.5, 1] the absolute maximum value is f(0.5) = 1.860040545 and
the absolute minimum value is f(1) = −3.899329036, accurate at least to the places
listed.
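The critical point that Maple located in part (a) can be cross-checked without Maple. The Python sketch below (our own; it uses simple bisection in place of Maple's fsolve, and the derivative formula computed in the text) finds the root of f′(x) = −12 sin 2x − 4x cos 2x in [1, 2] and evaluates f there:

```python
import math

def f(x):
    return 5 * math.cos(2 * x) - 2 * x * math.sin(2 * x)

def fp(x):
    # f'(x), as computed by Maple in the text
    return -12 * math.sin(2 * x) - 4 * x * math.cos(2 * x)

# Bisection on [1, 2]: fp is negative at 1 and positive at 2.
lo, hi = 1.0, 2.0
for _ in range(60):
    mid = (lo + hi) / 2
    if fp(lo) * fp(mid) <= 0:
        hi = mid        # sign change is in [lo, mid]
    else:
        lo = mid        # sign change is in [mid, hi]
root = (lo + hi) / 2
print(root, f(root))    # about 1.358229874 and -5.675301338
```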
The following theorem is not generally presented in a basic calculus course, but is
derived by applying Rolle’s Theorem successively to f, f′, …, and, finally, to f⁽ⁿ⁻¹⁾.
This result is considered in Exercise 23.
Theorem 1.10  (Generalized Rolle’s Theorem)
Suppose f ∈ C[a, b] is n times differentiable on (a, b). If f(x) = 0 at the n + 1 distinct
numbers a ≤ x₀ < x₁ < ⋯ < xₙ ≤ b, then a number c in (x₀, xₙ), and hence in (a, b),
exists with f⁽ⁿ⁾(c) = 0.
We will also make frequent use of the Intermediate Value Theorem. Although its
statement seems reasonable, its proof is beyond the scope of the usual calculus course. It can,
however, be found in most analysis texts.
Theorem 1.11  (Intermediate Value Theorem)
If f ∈ C[a, b] and K is any number between f(a) and f(b), then there exists a number c
in (a, b) for which f(c) = K.
Figure 1.7 shows one choice for the number that is guaranteed by the Intermediate
Value Theorem. In this example there are two other possibilities.
Figure 1.7  [Graph of y = f(x) through (a, f(a)) and (b, f(b)); the horizontal line y = K, with K between f(a) and f(b), meets the graph at a point c in (a, b).]
Example 2  Show that x⁵ − 2x³ + 3x² − 1 = 0 has a solution in the interval [0, 1].

Solution  Consider the function defined by f(x) = x⁵ − 2x³ + 3x² − 1. The function f is
continuous on [0, 1]. In addition,
    f(0) = −1 < 0  and  0 < 1 = f(1).

The Intermediate Value Theorem implies that a number x exists, with 0 < x < 1, for which
x⁵ − 2x³ + 3x² − 1 = 0.
As seen in Example 2, the Intermediate Value Theorem is used to determine when
solutions to certain problems exist. It does not, however, give an efficient means for finding
these solutions. This topic is considered in Chapter 2.
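The sign change in Example 2 is simple to confirm numerically, and repeatedly halving the interval (a preview of our own, sketched in Python ahead of the systematic treatment in Chapter 2) will even pin down one of the solutions:

```python
def f(x):
    return x ** 5 - 2 * x ** 3 + 3 * x ** 2 - 1

# The Intermediate Value Theorem hypothesis: a sign change on [0, 1].
assert f(0) < 0 < f(1)

# Repeatedly halve the subinterval that still contains a sign change.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if f(lo) * f(mid) <= 0:
        hi = mid
    else:
        lo = mid
root = (lo + hi) / 2
print(root, f(root))   # f is essentially zero at this point in (0, 1)
```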
Integration
The other basic concept of calculus that will be used extensively is the Riemann integral.
Georg Friedrich Bernhard
Riemann (1826–1866) made
many of the important
discoveries classifying the
functions that have integrals. He
also did fundamental work in
geometry and complex function
theory, and is regarded as one of
the profound mathematicians of
the nineteenth century.
Definition 1.12  The Riemann integral of the function f on the interval [a, b] is the following limit,
provided it exists:

    ∫ₐᵇ f(x) dx = lim_{max Δxᵢ→0} Σᵢ₌₁ⁿ f(zᵢ) Δxᵢ,

where the numbers x₀, x₁, …, xₙ satisfy a = x₀ ≤ x₁ ≤ ⋯ ≤ xₙ = b, where Δxᵢ = xᵢ − xᵢ₋₁,
for each i = 1, 2, …, n, and zᵢ is arbitrarily chosen in the interval [xᵢ₋₁, xᵢ].
A function f that is continuous on an interval [a, b] is also Riemann integrable on
[a, b]. This permits us to choose, for computational convenience, the points xᵢ to be equally
spaced in [a, b], and for each i = 1, 2, …, n, to choose zᵢ = xᵢ. In this case,

    ∫ₐᵇ f(x) dx = lim_{n→∞} ((b − a)/n) Σᵢ₌₁ⁿ f(xᵢ),

where the numbers shown in Figure 1.8 as xᵢ are xᵢ = a + i(b − a)/n.
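The equally spaced sum above is immediate to compute. A Python sketch of our own approximates ∫₀¹ x² dx = 1/3 with these right-endpoint sums for increasing n:

```python
def riemann_sum(f, a, b, n):
    """Right-endpoint Riemann sum with n equal subintervals, x_i = a + i(b - a)/n."""
    dx = (b - a) / n
    return dx * sum(f(a + i * dx) for i in range(1, n + 1))

f = lambda x: x * x
for n in (10, 100, 1000):
    print(n, riemann_sum(f, 0.0, 1.0, n))   # tends to 1/3 as n grows
```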
Figure 1.8  [Graph of y = f(x) with the interval [a, b] partitioned by a = x₀, x₁, x₂, …, xᵢ₋₁, xᵢ, …, xₙ₋₁, xₙ = b.]
Two other results will be needed in our study of numerical analysis. The first is a
generalization of the usual Mean Value Theorem for Integrals.
Theorem 1.13  (Weighted Mean Value Theorem for Integrals)
Suppose f ∈ C[a, b], the Riemann integral of g exists on [a, b], and g(x) does not change
sign on [a, b]. Then there exists a number c in (a, b) with

    ∫ₐᵇ f(x)g(x) dx = f(c) ∫ₐᵇ g(x) dx.
When g(x) ≡ 1, Theorem 1.13 is the usual Mean Value Theorem for Integrals. It gives
the average value of the function f over the interval [a, b] as (see Figure 1.9)

    f(c) = (1/(b − a)) ∫ₐᵇ f(x) dx.
Figure 1.9  [Graph of y = f(x) on [a, b]; the horizontal line at height f(c) marks the average value of f.]
The proof of Theorem 1.13 is not generally given in a basic calculus course but can be
found in most analysis texts (see, for example, [Fu], p. 162).
Taylor Polynomials and Series
The final theorem in this review from calculus describes the Taylor polynomials. These
polynomials are used extensively in numerical analysis.
Theorem 1.14  (Taylor’s Theorem)
Suppose f ∈ Cⁿ[a, b], that f⁽ⁿ⁺¹⁾ exists on [a, b], and x₀ ∈ [a, b]. For every x ∈ [a, b],
there exists a number ξ(x) between x₀ and x with
Brook Taylor (1685–1731)
described this series in 1715 in
the paper Methodus
incrementorum directa et inversa.
Special cases of the result, and
likely the result itself, had been
previously known to Isaac
Newton, James Gregory, and
others.
    f(x) = Pₙ(x) + Rₙ(x),

where

    Pₙ(x) = f(x₀) + f′(x₀)(x − x₀) + (f″(x₀)/2!)(x − x₀)² + ⋯ + (f⁽ⁿ⁾(x₀)/n!)(x − x₀)ⁿ
          = Σₖ₌₀ⁿ (f⁽ᵏ⁾(x₀)/k!)(x − x₀)ᵏ
and

    Rₙ(x) = (f⁽ⁿ⁺¹⁾(ξ(x))/(n + 1)!)(x − x₀)ⁿ⁺¹.
Here Pₙ(x) is called the nth Taylor polynomial for f about x₀, and Rₙ(x) is called
the remainder term (or truncation error) associated with Pₙ(x). Since the number ξ(x)
in the truncation error Rₙ(x) depends on the value of x at which the polynomial Pₙ(x) is
being evaluated, it is a function of the variable x. However, we should not expect to be
able to explicitly determine the function ξ(x). Taylor’s Theorem simply ensures that such a
function exists, and that its value lies between x and x₀. In fact, one of the common problems
in numerical methods is to try to determine a realistic bound for the value of f⁽ⁿ⁺¹⁾(ξ(x))
when x is in some specified interval.
Colin Maclaurin (1698–1746) is
best known as the defender of the
calculus of Newton when it came
under bitter attack by the Irish
philosopher, the Bishop George
Berkeley.
The infinite series obtained by taking the limit of Pₙ(x) as n → ∞ is called the Taylor
series for f about x₀. In the case x₀ = 0, the Taylor polynomial is often called a Maclaurin
polynomial, and the Taylor series is often called a Maclaurin series.
Maclaurin did not discover the
series that bears his name; it was
known to 17th century
mathematicians before he was
born. However, he did devise a
method for solving a system of
linear equations that is known as
Cramer’s rule, which Cramer did
not publish until 1750.
The term truncation error in the Taylor polynomial refers to the error involved in
using a truncated, or finite, summation to approximate the sum of an infinite series.
Example 3  Let f(x) = cos x and x₀ = 0. Determine

(a) the second Taylor polynomial for f about x₀; and
(b) the third Taylor polynomial for f about x₀.

Solution  Since f ∈ C^∞(ℝ), Taylor’s Theorem can be applied for any n ≥ 0. Also,

    f′(x) = −sin x,  f″(x) = −cos x,  f‴(x) = sin x,  and  f⁽⁴⁾(x) = cos x,

so

    f(0) = 1,  f′(0) = 0,  f″(0) = −1,  and  f‴(0) = 0.

(a) For n = 2 and x₀ = 0, we have

    cos x = f(0) + f′(0)x + (f″(0)/2!)x² + (f‴(ξ(x))/3!)x³ = 1 − (1/2)x² + (1/6)x³ sin ξ(x),

where ξ(x) is some (generally unknown) number between 0 and x. (See Figure 1.10.)
Figure 1.10  [Graphs of y = cos x and y = P₂(x) = 1 − x²/2 on [−π, π].]
When x = 0.01, this becomes

    cos 0.01 = 1 − (1/2)(0.01)² + (1/6)(0.01)³ sin ξ(0.01) = 0.99995 + (10⁻⁶/6) sin ξ(0.01).

The approximation to cos 0.01 given by the Taylor polynomial is therefore 0.99995. The
truncation error, or remainder term, associated with this approximation is

    (10⁻⁶/6) sin ξ(0.01) = 0.16̄ × 10⁻⁶ sin ξ(0.01),

where the bar over the 6 in 0.16̄ is used to indicate that this digit repeats indefinitely.
Although we have no way of determining sin ξ(0.01), we know that all values of the sine
lie in the interval [−1, 1], so the error occurring if we use the approximation 0.99995 for
the value of cos 0.01 is bounded by

    |cos(0.01) − 0.99995| = 0.16̄ × 10⁻⁶ |sin ξ(0.01)| ≤ 0.16̄ × 10⁻⁶.

Hence the approximation 0.99995 matches at least the first five digits of cos 0.01, and

    0.9999483 < 0.99995 − 1.6̄ × 10⁻⁶ ≤ cos 0.01 ≤ 0.99995 + 1.6̄ × 10⁻⁶ < 0.9999517.

The error bound is much larger than the actual error. This is due in part to the poor
bound we used for |sin ξ(x)|. It is shown in Exercise 24 that for all values of x, we have
|sin x| ≤ |x|. Since 0 ≤ ξ < 0.01, we could have used the fact that |sin ξ(x)| ≤ 0.01 in the
error formula, producing the bound 0.16̄ × 10⁻⁸.
(b) Since f‴(0) = 0, the third Taylor polynomial with remainder term about x₀ = 0
is

    cos x = 1 − (1/2)x² + (1/24)x⁴ cos ξ̃(x),

where 0 < ξ̃(x) < 0.01. The approximating polynomial remains the same, and the
approximation is still 0.99995, but we now have much better accuracy assurance. Since
|cos ξ̃(x)| ≤ 1 for all x, we have

    |(1/24)x⁴ cos ξ̃(x)| ≤ (1/24)(0.01)⁴(1) ≈ 4.2 × 10⁻¹⁰.

So

    |cos 0.01 − 0.99995| ≤ 4.2 × 10⁻¹⁰,

and

    0.99994999958 = 0.99995 − 4.2 × 10⁻¹⁰ ≤ cos 0.01 ≤ 0.99995 + 4.2 × 10⁻¹⁰ = 0.99995000042.
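Both error claims in Example 3 can be verified directly in Python (a cross-check of our own; math.cos supplies the reference value of cos 0.01):

```python
import math

x = 0.01
p2 = 1 - 0.5 * x ** 2                 # P2(x) = P3(x) = 1 - x^2/2
actual_error = abs(math.cos(x) - p2)

bound_a = x ** 3 / 6                  # part (a) bound, using |sin ξ| <= 1
bound_b = x ** 4 / 24                 # part (b) bound, using |cos ξ| <= 1

print(p2)                              # 0.99995
print(actual_error, bound_a, bound_b)  # the actual error sits under both bounds
assert actual_error <= bound_b <= bound_a
```

As the example predicts, the fourth-degree remainder bound from part (b) is tighter than the crude part (a) bound by more than two orders of magnitude.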
Example 3 illustrates the two objectives of numerical analysis:
(i) Find an approximation to the solution of a given problem.
(ii) Determine a bound for the accuracy of the approximation.
The Taylor polynomials in both parts provide the same answer to (i), but the third Taylor
polynomial gave a much better answer to (ii) than the second Taylor polynomial.
We can also use the Taylor polynomials to give us approximations to integrals.
Illustration  We can use the third Taylor polynomial and its remainder term found in Example 3 to
approximate ∫₀^0.1 cos x dx. We have

    ∫₀^0.1 cos x dx = ∫₀^0.1 (1 − (1/2)x²) dx + (1/24) ∫₀^0.1 x⁴ cos ξ̃(x) dx
                    = [x − (1/6)x³]₀^0.1 + (1/24) ∫₀^0.1 x⁴ cos ξ̃(x) dx
                    = 0.1 − (1/6)(0.1)³ + (1/24) ∫₀^0.1 x⁴ cos ξ̃(x) dx.

Therefore

    ∫₀^0.1 cos x dx ≈ 0.1 − (1/6)(0.1)³ = 0.09983̄.
A bound for the error in this approximation is determined from the integral of the Taylor
remainder term and the fact that |cos ξ̃(x)| ≤ 1 for all x:

    |(1/24) ∫₀^0.1 x⁴ cos ξ̃(x) dx| ≤ (1/24) ∫₀^0.1 x⁴ |cos ξ̃(x)| dx ≤ (1/24) ∫₀^0.1 x⁴ dx = (0.1)⁵/120 = 8.3̄ × 10⁻⁸.
The true value of this integral is

    ∫₀^0.1 cos x dx = sin x]₀^0.1 = sin 0.1 ≈ 0.099833416647,

so the actual error for this approximation is 8.3314 × 10⁻⁸, which is within the error
bound.
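These numbers are simple to reproduce outside Maple. The Python check below (our own sketch) compares the integral of the polynomial with sin 0.1:

```python
import math

approx = 0.1 - 0.1 ** 3 / 6      # integral of the Taylor polynomial 1 - x^2/2 on [0, 0.1]
true_value = math.sin(0.1)       # exact value of the integral of cos x on [0, 0.1]
error = abs(true_value - approx)
error_bound = 0.1 ** 5 / 120     # the bound derived from the remainder term

print(approx, true_value)
print(error, error_bound)        # roughly 8.33e-08, just under the bound
```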
We can also use Maple to obtain these results. Define f by

    f := cos(x)

Maple allows us to place multiple statements on a line separated by either a semicolon or
a colon. A semicolon will produce all the output, and a colon suppresses all but the final
Maple response. For example, the third Taylor polynomial is given by

    s3 := taylor(f, x = 0, 4):  p3 := convert(s3, polynom)

    1 − (1/2)x²

The first statement, s3 := taylor(f, x = 0, 4), determines the Taylor polynomial about
x₀ = 0 with four terms (degree 3) and an indication of its remainder. The second, p3 :=
convert(s3, polynom), converts the series s3 to the polynomial p3 by dropping the remainder
term.
Maple normally displays 10 decimal digits for approximations. To instead obtain the
11 digits we want for this illustration, enter

    Digits := 11

and evaluate f(0.01) and P₃(0.01) with

    y1 := evalf(subs(x = 0.01, f));  y2 := evalf(subs(x = 0.01, p3))
This produces

    0.99995000042
    0.99995000000

To show both the function (in black) and the polynomial (in cyan) near x₀ = 0, we enter

    plot((f, p3), x = −2 .. 2)

and obtain the Maple plot shown in Figure 1.11.
Figure 1.11  [Maple plot of y = cos x and y = p3(x) on [−2, 2].]
The integrals of f and the polynomial are given by

    q1 := int(f, x = 0 .. 0.1);  q2 := int(p3, x = 0 .. 0.1)

    0.099833416647
    0.099833333333

We assigned the names q1 and q2 to these values so that we could easily determine the error
with the command

    err := |q1 − q2|

    8.3314 × 10⁻⁸
There is an alternate method for generating the Taylor polynomials within the
NumericalAnalysis subpackage of Maple’s Student package. This subpackage will be discussed
in Chapter 2.
EXERCISE SET 1.1
1. Show that the following equations have at least one solution in the given intervals.
    a. x cos x − 2x² + 3x − 1 = 0, [0.2, 0.3] and [1.2, 1.3]
    b. (x − 2)² − ln x = 0, [1, 2] and [e, 4]
    c. 2x cos(2x) − (x − 2)² = 0, [2, 3] and [3, 4]
    d. x − (ln x)ˣ = 0, [4, 5]
2. Find intervals containing solutions to the following equations.
    a. x − 3⁻ˣ = 0
    b. 4x² − eˣ = 0
    c. x³ − 2x² − 4x + 2 = 0
    d. x³ + 4.001x² + 4.002x + 1.101 = 0
3. Show that f′(x) is 0 at least once in the given intervals.
    a. f(x) = 1 − eˣ + (e − 1) sin((π/2)x), [0, 1]
    b. f(x) = (x − 1) tan x + x sin πx, [0, 1]
    c. f(x) = x sin πx − (x − 2) ln x, [1, 2]
    d. f(x) = (x − 2) sin x ln(x + 2), [−1, 3]
4. Find max_{a≤x≤b} |f(x)| for the following functions and intervals.
    a. f(x) = (2 − eˣ + 2x)/3, [0, 1]
    b. f(x) = (4x − 3)/(x² − 2x), [0.5, 1]
    c. f(x) = 2x cos(2x) − (x − 2)², [2, 4]
    d. f(x) = 1 + e^(−cos(x−1)), [1, 2]
5. Use the Intermediate Value Theorem 1.11 and Rolle’s Theorem 1.7 to show that the graph of
f(x) = x³ + 2x + k crosses the x-axis exactly once, regardless of the value of the constant k.
6. Suppose f ∈ C[a, b] and f′(x) exists on (a, b). Show that if f′(x) ≠ 0 for all x in (a, b), then there
can exist at most one number p in [a, b] with f(p) = 0.
7. Let f(x) = x³.
    a. Find the second Taylor polynomial P₂(x) about x₀ = 0.
    b. Find R₂(0.5) and the actual error in using P₂(0.5) to approximate f(0.5).
    c. Repeat part (a) using x₀ = 1.
    d. Repeat part (b) using the polynomial from part (c).
8. Find the third Taylor polynomial P₃(x) for the function f(x) = √(x + 1) about x₀ = 0. Approximate
√0.5, √0.75, √1.25, and √1.5 using P₃(x), and find the actual errors.
9. Find the second Taylor polynomial P₂(x) for the function f(x) = eˣ cos x about x₀ = 0.
    a. Use P₂(0.5) to approximate f(0.5). Find an upper bound for the error |f(0.5) − P₂(0.5)| using the
    error formula, and compare it to the actual error.
    b. Find a bound for the error |f(x) − P₂(x)| in using P₂(x) to approximate f(x) on the interval [0, 1].
    c. Approximate ∫₀¹ f(x) dx using ∫₀¹ P₂(x) dx.
    d. Find an upper bound for the error in (c) using ∫₀¹ |R₂(x)| dx, and compare the bound to the actual
    error.
10. Repeat Exercise 9 using x₀ = π/6.
11. Find the third Taylor polynomial P₃(x) for the function f(x) = (x − 1) ln x about x₀ = 1.
    a. Use P₃(0.5) to approximate f(0.5). Find an upper bound for the error |f(0.5) − P₃(0.5)| using the
    error formula, and compare it to the actual error.
    b. Find a bound for the error |f(x) − P₃(x)| in using P₃(x) to approximate f(x) on the interval
    [0.5, 1.5].
    c. Approximate ∫_{0.5}^{1.5} f(x) dx using ∫_{0.5}^{1.5} P₃(x) dx.
    d. Find an upper bound for the error in (c) using ∫_{0.5}^{1.5} |R₃(x)| dx, and compare the bound to the
    actual error.
12. Let f(x) = 2x cos(2x) (x 2)
2
and x
0
= 0.
a. Find the third Taylor polynomial P
3
(x), and use it to approximate f(0.4).
b. Use the error formula in Taylor’s Theorem to find an upper bound for the error |f(0.4)P
3
(0.4)|.
Compute the actual error.
16 CHAPTER 1 Mathematical Preliminaries and Error Analysis
    c. Find the fourth Taylor polynomial P_4(x), and use it to approximate f(0.4).
    d. Use the error formula in Taylor's Theorem to find an upper bound for the error |f(0.4) − P_4(0.4)|. Compute the actual error.
13. Find the fourth Taylor polynomial P_4(x) for the function f(x) = x e^{x^2} about x_0 = 0.
    a. Find an upper bound for |f(x) − P_4(x)|, for 0 ≤ x ≤ 0.4.
    b. Approximate ∫_0^0.4 f(x) dx using ∫_0^0.4 P_4(x) dx.
    c. Find an upper bound for the error in (b) using ∫_0^0.4 |R_4(x)| dx.
    d. Approximate f′(0.2) using P_4′(0.2), and find the error.
14. Use the error term of a Taylor polynomial to estimate the error involved in using sin x ≈ x to approximate sin 1°.
15. Use a Taylor polynomial about π/4 to approximate cos 42° to an accuracy of 10^{−6}.
16. Let f(x) = e^{x/2} sin(x/3). Use Maple to determine the following.
    a. The third Maclaurin polynomial P_3(x).
    b. f^{(4)}(x) and a bound for the error |f(x) − P_3(x)| on [0, 1].
17. Let f(x) = ln(x^2 + 2). Use Maple to determine the following.
    a. The Taylor polynomial P_3(x) for f expanded about x_0 = 1.
    b. The maximum error |f(x) − P_3(x)|, for 0 ≤ x ≤ 1.
    c. The Maclaurin polynomial P̃_3(x) for f.
    d. The maximum error |f(x) − P̃_3(x)|, for 0 ≤ x ≤ 1.
    e. Does P_3(0) approximate f(0) better than P̃_3(1) approximates f(1)?
18. Let f(x) = (1 − x)^{−1} and x_0 = 0. Find the nth Taylor polynomial P_n(x) for f(x) about x_0. Find a value of n necessary for P_n(x) to approximate f(x) to within 10^{−6} on [0, 0.5].
19. Let f(x) = e^x and x_0 = 0. Find the nth Taylor polynomial P_n(x) for f(x) about x_0. Find a value of n necessary for P_n(x) to approximate f(x) to within 10^{−6} on [0, 0.5].
20. Find the nth Maclaurin polynomial P_n(x) for f(x) = arctan x.
21. The polynomial P_2(x) = 1 − (1/2)x^2 is to be used to approximate f(x) = cos x in [−1/2, 1/2]. Find a bound for the maximum error.
22. The nth Taylor polynomial for a function f at x_0 is sometimes referred to as the polynomial of degree at most n that "best" approximates f near x_0.
    a. Explain why this description is accurate.
    b. Find the quadratic polynomial that best approximates a function f near x_0 = 1 if the tangent line at x_0 = 1 has equation y = 4x − 1, and if f″(1) = 6.
23. Prove the Generalized Rolle's Theorem, Theorem 1.10, by verifying the following.
    a. Use Rolle's Theorem to show that f′(z_i) = 0 for n − 1 numbers in [a, b] with a < z_1 < z_2 < ··· < z_{n−1} < b.
    b. Use Rolle's Theorem to show that f″(w_i) = 0 for n − 2 numbers in [a, b] with z_1 < w_1 < z_2 < w_2 < ··· < w_{n−2} < z_{n−1} < b.
    c. Continue the arguments in parts (a) and (b) to show that for each j = 1, 2, ..., n − 1 there are n − j distinct numbers in [a, b] where f^{(j)} is 0.
    d. Show that part (c) implies the conclusion of the theorem.
24. In Example 3 it is stated that for all x we have |sin x| ≤ |x|. Use the following to verify this statement.
    a. Show that for all x ≥ 0 the function f(x) = x − sin x is non-decreasing, which implies that sin x ≤ x with equality only when x = 0.
    b. Use the fact that the sine function is odd to reach the conclusion.
25. A Maclaurin polynomial for e^x is used to give the approximation 2.5 to e. The error bound in this approximation is established to be E = 1/6. Find a bound for the error in E.
26. The error function defined by

        erf(x) = (2/√π) ∫_0^x e^{−t^2} dt
    gives the probability that any one of a series of trials will lie within x units of the mean, assuming that the trials have a normal distribution with mean 0 and standard deviation √2/2. This integral cannot be evaluated in terms of elementary functions, so an approximating technique must be used.
    a. Integrate the Maclaurin series for e^{−x^2} to show that

           erf(x) = (2/√π) Σ_{k=0}^∞ (−1)^k x^{2k+1} / ((2k + 1) k!).

    b. The error function can also be expressed in the form

           erf(x) = (2/√π) e^{−x^2} Σ_{k=0}^∞ 2^k x^{2k+1} / (1 · 3 · 5 ··· (2k + 1)).

       Verify that the two series agree for k = 1, 2, 3, and 4. [Hint: Use the Maclaurin series for e^{−x^2}.]
    c. Use the series in part (a) to approximate erf(1) to within 10^{−7}.
    d. Use the same number of terms as in part (c) to approximate erf(1) with the series in part (b).
    e. Explain why difficulties occur using the series in part (b) to approximate erf(x).
27. A function f : [a, b] → R is said to satisfy a Lipschitz condition with Lipschitz constant L on [a, b] if, for every x, y ∈ [a, b], we have |f(x) − f(y)| ≤ L|x − y|.
    a. Show that if f satisfies a Lipschitz condition with Lipschitz constant L on an interval [a, b], then f ∈ C[a, b].
    b. Show that if f has a derivative that is bounded on [a, b] by L, then f satisfies a Lipschitz condition with Lipschitz constant L on [a, b].
    c. Give an example of a function that is continuous on a closed interval but does not satisfy a Lipschitz condition on the interval.
28. Suppose f ∈ C[a, b] and that x_1 and x_2 are in [a, b].
    a. Show that a number ξ exists between x_1 and x_2 with

           f(ξ) = (f(x_1) + f(x_2))/2 = (1/2) f(x_1) + (1/2) f(x_2).

    b. Suppose that c_1 and c_2 are positive constants. Show that a number ξ exists between x_1 and x_2 with

           f(ξ) = (c_1 f(x_1) + c_2 f(x_2)) / (c_1 + c_2).

    c. Give an example to show that the result in part (b) does not necessarily hold when c_1 and c_2 have opposite signs with c_1 = −c_2.
29. Let f ∈ C[a, b], and let p be in the open interval (a, b).
    a. Suppose f(p) ≠ 0. Show that a δ > 0 exists with f(x) ≠ 0, for all x in [p − δ, p + δ], with [p − δ, p + δ] a subset of [a, b].
    b. Suppose f(p) = 0 and k > 0 is given. Show that a δ > 0 exists with |f(x)| ≤ k, for all x in [p − δ, p + δ], with [p − δ, p + δ] a subset of [a, b].
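The series in Exercise 26(a) is easy to experiment with numerically. A minimal Python sketch (the function name erf_series and the choice of 12 terms are ours, not the book's):

```python
import math

def erf_series(x, terms):
    """Partial sum of the Maclaurin series for erf(x) from Exercise 26(a)."""
    s = 0.0
    for k in range(terms):
        s += (-1) ** k * x ** (2 * k + 1) / ((2 * k + 1) * math.factorial(k))
    return 2 / math.sqrt(math.pi) * s

# The alternating series converges quickly for moderate x; compare with math.erf.
print(erf_series(1.0, 12))
print(math.erf(1.0))
```

Because the series alternates, the truncation error is bounded by the first omitted term, which is how part (c) of the exercise can be attacked.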
1.2 Round-off Errors and Computer Arithmetic
The arithmetic performed by a calculator or computer is different from the arithmetic in algebra and calculus courses. You would likely expect that we always have as true statements things such as 2 + 2 = 4, 4 · 8 = 32, and (√3)^2 = 3. However, with computer arithmetic we expect exact results for 2 + 2 = 4 and 4 · 8 = 32, but we will not have precisely (√3)^2 = 3. To understand why this is true we must explore the world of finite-digit arithmetic.
In our traditional mathematical world we permit numbers with an infinite number of digits. The arithmetic we use in this world defines √3 as that unique positive number that when multiplied by itself produces the integer 3. In the computational world, however, each representable number has only a fixed and finite number of digits. This means, for example, that only rational numbers—and not even all of these—can be represented exactly. Since √3 is not rational, it is given an approximate representation, one whose square will not be precisely 3, although it will likely be sufficiently close to 3 to be acceptable in most situations. In most cases, then, this machine arithmetic is satisfactory and passes without notice or concern, but at times problems arise because of this discrepancy.

Error due to rounding should be expected whenever computations are performed using numbers that are not powers of 2. Keeping this error under control is extremely important when the number of calculations is large.

The error that is produced when a calculator or computer is used to perform real-number calculations is called round-off error. It occurs because the arithmetic performed in a machine involves numbers with only a finite number of digits, with the result that calculations are performed with only approximate representations of the actual numbers. In a computer, only a relatively small subset of the real number system is used for the representation of all the real numbers. This subset contains only rational numbers, both positive and negative, and stores the fractional part, together with an exponential part.
Binary Machine Numbers
In 1985, the IEEE (Institute for Electrical and Electronic Engineers) published a report called Binary Floating Point Arithmetic Standard 754–1985. An updated version was published in 2008 as IEEE 754-2008. This provides standards for binary and decimal floating-point numbers, formats for data interchange, algorithms for rounding arithmetic operations, and for the handling of exceptions. Formats are specified for single, double, and extended precisions, and these standards are generally followed by all microcomputer manufacturers using floating-point hardware.

A 64-bit (binary digit) representation is used for a real number. The first bit is a sign indicator, denoted s. This is followed by an 11-bit exponent, c, called the characteristic, and a 52-bit binary fraction, f, called the mantissa. The base for the exponent is 2.

Since 52 binary digits correspond to between 16 and 17 decimal digits, we can assume that a number represented in this system has at least 16 decimal digits of precision. The exponent of 11 binary digits gives a range of 0 to 2^11 − 1 = 2047. However, using only positive integers for the exponent would not permit an adequate representation of numbers with small magnitude. To ensure that numbers with small magnitude are equally representable, 1023 is subtracted from the characteristic, so the range of the exponent is actually from −1023 to 1024.

To save storage and provide a unique representation for each floating-point number, a normalization is imposed. Using this system gives a floating-point number of the form

    (−1)^s 2^{c−1023} (1 + f).
Illustration   Consider the machine number

    0 10000000011 1011100100010000000000000000000000000000000000000000.

The leftmost bit is s = 0, which indicates that the number is positive. The next 11 bits, 10000000011, give the characteristic and are equivalent to the decimal number

    c = 1 · 2^10 + 0 · 2^9 + ··· + 0 · 2^2 + 1 · 2^1 + 1 · 2^0 = 1024 + 2 + 1 = 1027.

The exponential part of the number is, therefore, 2^{1027−1023} = 2^4. The final 52 bits specify that the mantissa is

    f = 1 · (1/2)^1 + 1 · (1/2)^3 + 1 · (1/2)^4 + 1 · (1/2)^5 + 1 · (1/2)^8 + 1 · (1/2)^12.

As a consequence, this machine number precisely represents the decimal number

    (−1)^s 2^{c−1023} (1 + f) = (−1)^0 · 2^{1027−1023} (1 + 1/2 + 1/8 + 1/16 + 1/32 + 1/256 + 1/4096)
                              = 27.56640625.

However, the next smallest machine number is

    0 10000000011 1011100100001111111111111111111111111111111111111111,

and the next largest machine number is

    0 10000000011 1011100100010000000000000000000000000000000000000001.

This means that our original machine number represents not only 27.56640625, but also half of the real numbers that are between 27.56640625 and the next smallest machine number, as well as half the numbers between 27.56640625 and the next largest machine number. To be precise, it represents any real number in the interval

    [27.5664062499999982236431605997495353221893310546875,
     27.5664062500000017763568394002504646778106689453125).
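Python stores its floats in exactly this binary64 format, so the Illustration can be checked directly with the standard struct module; a short sketch:

```python
import struct

# Reassemble the 64-bit machine number from its sign, characteristic, and mantissa.
bits = ("0"                                                        # sign s
        "10000000011"                                              # 11-bit characteristic c
        "1011100100010000000000000000000000000000000000000000")    # 52-bit mantissa f
value = struct.unpack(">d", int(bits, 2).to_bytes(8, "big"))[0]
print(value)  # 27.56640625
```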
The smallest normalized positive number that can be represented has s = 0, c = 1, and f = 0 and is equivalent to

    2^{−1022} · (1 + 0) ≈ 0.22251 × 10^{−307},

and the largest has s = 0, c = 2046, and f = 1 − 2^{−52} and is equivalent to

    2^{1023} · (2 − 2^{−52}) ≈ 0.17977 × 10^{309}.

Numbers occurring in calculations that have a magnitude less than 2^{−1022} · (1 + 0) result in underflow and are generally set to zero. Numbers greater than 2^{1023} · (2 − 2^{−52}) result in overflow and typically cause the computations to stop (unless the program has been designed to detect this occurrence). Note that there are two representations for the number zero: a positive 0 when s = 0, c = 0 and f = 0, and a negative 0 when s = 1, c = 0 and f = 0.
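These extremes, and the machine neighbors of 27.56640625, can be inspected from Python (math.nextafter requires Python 3.9 or later):

```python
import sys
import math

# Normalized double-precision extremes, matching the values quoted above.
print(sys.float_info.min)            # 2^(-1022), about 0.22251e-307
print(sys.float_info.max)            # (2 - 2^(-52)) * 2^1023, about 0.17977e309

# The neighboring machine numbers around 27.56640625:
x = 27.56640625
print(math.nextafter(x, 0.0))        # next smallest machine number
print(math.nextafter(x, math.inf))   # next largest machine number
```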
Decimal Machine Numbers
The use of binary digits tends to conceal the computational difficulties that occur when a finite collection of machine numbers is used to represent all the real numbers. To examine these problems, we will use more familiar decimal numbers instead of binary representation. Specifically, we assume that machine numbers are represented in the normalized decimal floating-point form

    ±0.d_1 d_2 ... d_k × 10^n,    1 ≤ d_1 ≤ 9,  and  0 ≤ d_i ≤ 9,

for each i = 2, ..., k. Numbers of this form are called k-digit decimal machine numbers.

Any positive real number within the numerical range of the machine can be normalized to the form

    y = 0.d_1 d_2 ... d_k d_{k+1} d_{k+2} ... × 10^n.
The floating-point form of y, denoted fl(y), is obtained by terminating the mantissa of y at k decimal digits. There are two common ways of performing this termination. One method, called chopping, is to simply chop off the digits d_{k+1} d_{k+2} .... This produces the floating-point form

    fl(y) = 0.d_1 d_2 ... d_k × 10^n.

The error that results from replacing a number with its floating-point form is called round-off error regardless of whether the rounding or chopping method is used.

The other method, called rounding, adds 5 × 10^{n−(k+1)} to y and then chops the result to obtain a number of the form

    fl(y) = 0.δ_1 δ_2 ... δ_k × 10^n.

For rounding, when d_{k+1} ≥ 5, we add 1 to d_k to obtain fl(y); that is, we round up. When d_{k+1} < 5, we simply chop off all but the first k digits; so we round down. If we round down, then δ_i = d_i, for each i = 1, 2, ..., k. However, if we round up, the digits (and even the exponent) might change.
Example 1   Determine the five-digit (a) chopping and (b) rounding values of the irrational number π.

Solution   The number π has an infinite decimal expansion of the form π = 3.14159265.... Written in normalized decimal form, we have

    π = 0.314159265... × 10^1.

(a) The floating-point form of π using five-digit chopping is

    fl(π) = 0.31415 × 10^1 = 3.1415.

(b) The sixth digit of the decimal expansion of π is a 9, so the floating-point form of π using five-digit rounding is

    fl(π) = (0.31415 + 0.00001) × 10^1 = 3.1416.
The following definition describes two methods for measuring approximation errors.

The relative error is generally a better measure of accuracy than the absolute error because it takes into consideration the size of the number being approximated.

Definition 1.15   Suppose that p* is an approximation to p. The absolute error is |p − p*|, and the relative error is |p − p*|/|p|, provided that p ≠ 0.

Consider the absolute and relative errors in representing p by p* in the following example.
Example 2   Determine the absolute and relative errors when approximating p by p* when

    (a) p = 0.3000 × 10^1 and p* = 0.3100 × 10^1;
    (b) p = 0.3000 × 10^{−3} and p* = 0.3100 × 10^{−3};
    (c) p = 0.3000 × 10^4 and p* = 0.3100 × 10^4.

Solution

    (a) For p = 0.3000 × 10^1 and p* = 0.3100 × 10^1 the absolute error is 0.1, and the relative error is 0.3333... × 10^{−1}.
    (b) For p = 0.3000 × 10^{−3} and p* = 0.3100 × 10^{−3} the absolute error is 0.1 × 10^{−4}, and the relative error is 0.3333... × 10^{−1}.
    (c) For p = 0.3000 × 10^4 and p* = 0.3100 × 10^4, the absolute error is 0.1 × 10^3, and the relative error is again 0.3333... × 10^{−1}.

This example shows that the same relative error, 0.3333... × 10^{−1}, occurs for widely varying absolute errors. As a measure of accuracy, the absolute error can be misleading and the relative error more meaningful, because the relative error takes into consideration the size of the value.
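The computations in Example 2 take one line each; a small Python helper (the name errors is ours):

```python
def errors(p, p_star):
    """Return the absolute and relative errors of the approximation p_star to p."""
    absolute = abs(p - p_star)
    return absolute, absolute / abs(p)          # requires p != 0

# The three cases of Example 2: same relative error, very different absolute errors.
for p, p_star in [(0.3000e1, 0.3100e1), (0.3000e-3, 0.3100e-3), (0.3000e4, 0.3100e4)]:
    print(errors(p, p_star))
```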
We often cannot find an accurate value for the true error in an approximation. Instead we find a bound for the error, which gives us a "worst-case" error.

The following definition uses relative error to give a measure of significant digits of accuracy for an approximation.

Definition 1.16   The number p* is said to approximate p to t significant digits (or figures) if t is the largest nonnegative integer for which

    |p − p*| / |p| ≤ 5 × 10^{−t}.
Table 1.1 illustrates the continuous nature of significant digits by listing, for the various values of p, the least upper bound of |p − p*|, denoted max |p − p*|, when p* agrees with p to four significant digits.

The term significant digits is often used to loosely describe the number of decimal digits that appear to be accurate. The definition is more precise, and provides a continuous concept.

Table 1.1

    p             0.1      0.5      100   1000  5000  9990   10000
    max |p − p*|  0.00005  0.00025  0.05  0.5   2.5   4.995  5.
Returning to the machine representation of numbers, we see that the floating-point representation fl(y) for the number y has the relative error

    | (y − fl(y)) / y |.

If k decimal digits and chopping are used for the machine representation of

    y = 0.d_1 d_2 ... d_k d_{k+1} ... × 10^n,
then

    | (y − fl(y)) / y | = | (0.d_1 d_2 ... d_k d_{k+1} ... × 10^n − 0.d_1 d_2 ... d_k × 10^n) / (0.d_1 d_2 ... × 10^n) |
                        = | (0.d_{k+1} d_{k+2} ... × 10^{n−k}) / (0.d_1 d_2 ... × 10^n) |
                        = | 0.d_{k+1} d_{k+2} ... / 0.d_1 d_2 ... | × 10^{−k}.

Since d_1 ≠ 0, the minimal value of the denominator is 0.1. The numerator is bounded above by 1. As a consequence,

    | (y − fl(y)) / y | ≤ (1 / 0.1) × 10^{−k} = 10^{−k+1}.

In a similar manner, a bound for the relative error when using k-digit rounding arithmetic is 0.5 × 10^{−k+1}. (See Exercise 24.)
Note that the bounds for the relative error using k-digit arithmetic are independent of the number being represented. This result is due to the manner in which the machine numbers are distributed along the real line. Because of the exponential form of the characteristic, the same number of decimal machine numbers is used to represent each of the intervals [0.1, 1], [1, 10], and [10, 100]. In fact, within the limits of the machine, the number of decimal machine numbers in [10^n, 10^{n+1}] is constant for all integers n.
Finite-Digit Arithmetic
In addition to inaccurate representation of numbers, the arithmetic performed in a computer is not exact. The arithmetic involves manipulating binary digits by various shifting, or logical, operations. Since the actual mechanics of these operations are not pertinent to this presentation, we shall devise our own approximation to computer arithmetic. Although our arithmetic will not give the exact picture, it suffices to explain the problems that occur. (For an explanation of the manipulations actually involved, the reader is urged to consult more technically oriented computer science texts, such as [Ma], Computer System Architecture.)

Assume that the floating-point representations fl(x) and fl(y) are given for the real numbers x and y and that the symbols ⊕, ⊖, ⊗, ⊘ represent machine addition, subtraction, multiplication, and division operations, respectively. We will assume a finite-digit arithmetic given by

    x ⊕ y = fl(fl(x) + fl(y)),    x ⊗ y = fl(fl(x) × fl(y)),
    x ⊖ y = fl(fl(x) − fl(y)),    x ⊘ y = fl(fl(x) ÷ fl(y)).

This arithmetic corresponds to performing exact arithmetic on the floating-point representations of x and y and then converting the exact result to its finite-digit floating-point representation.
Rounding arithmetic is easily implemented in Maple. For example, the command

    Digits := 5

causes all arithmetic to be rounded to 5 digits. To ensure that Maple uses approximate rather than exact arithmetic we use the evalf command. For example, if x = π and y = √2 then

    evalf(x); evalf(y)

produces 3.1416 and 1.4142, respectively. Then fl(fl(x) + fl(y)) is performed using 5-digit rounding arithmetic with

    evalf(evalf(x) + evalf(y))

which gives 4.5558. Implementing finite-digit chopping arithmetic is more difficult and requires a sequence of steps or a procedure. Exercise 27 explores this problem.
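In Python, by contrast, finite-digit chopping arithmetic comes almost for free: a decimal context with prec=5 and ROUND_DOWN applies five-digit chopping after every operation, in the spirit of the machine-arithmetic definition above. A sketch:

```python
from decimal import Decimal, Context, ROUND_DOWN

ctx = Context(prec=5, rounding=ROUND_DOWN)      # five-digit chopping

x = ctx.divide(Decimal(5), Decimal(7))          # fl(5/7)
y = ctx.divide(Decimal(1), Decimal(3))          # fl(1/3)

print(x, y)                     # 0.71428 0.33333
print(ctx.add(x, y))            # machine x + y: 1.0476
print(ctx.subtract(x, y))       # machine x - y: 0.38095
print(ctx.multiply(x, y))       # machine x * y: 0.23809
print(ctx.divide(x, y))         # machine x / y: 2.1428
```

Every context method chops its exact result back to five significant digits before returning it, which is precisely the fl(fl(x) op fl(y)) model.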
Example 3   Suppose that x = 5/7 and y = 1/3. Use five-digit chopping for calculating x + y, x − y, x × y, and x ÷ y.

Solution   Note that

    x = 5/7 = 0.714285714285...  and  y = 1/3 = 0.333333...

implies that the five-digit chopping values of x and y are

    fl(x) = 0.71428 × 10^0  and  fl(y) = 0.33333 × 10^0.

Thus

    x ⊕ y = fl(fl(x) + fl(y)) = fl(0.71428 × 10^0 + 0.33333 × 10^0)
          = fl(1.04761 × 10^0) = 0.10476 × 10^1.

The true value is x + y = 5/7 + 1/3 = 22/21, so we have

    Absolute Error = |22/21 − 0.10476 × 10^1| = 0.190 × 10^{−4}

and

    Relative Error = (0.190 × 10^{−4}) / (22/21) = 0.182 × 10^{−4}.

Table 1.2 lists the values of this and the other calculations.
Table 1.2

    Operation   Result           Actual value   Absolute error    Relative error
    x ⊕ y       0.10476 × 10^1   22/21          0.190 × 10^{−4}   0.182 × 10^{−4}
    x ⊖ y       0.38095 × 10^0   8/21           0.238 × 10^{−5}   0.625 × 10^{−5}
    x ⊗ y       0.23809 × 10^0   5/21           0.524 × 10^{−5}   0.220 × 10^{−4}
    x ⊘ y       0.21428 × 10^1   15/7           0.571 × 10^{−4}   0.267 × 10^{−4}
The maximum relative error for the operations in Example 3 is 0.267 × 10^{−4}, so the arithmetic produces satisfactory five-digit results. This is not the case in the following example.
Example 4   Suppose that in addition to x = 5/7 and y = 1/3 we have

    u = 0.714251,  v = 98765.9,  and  w = 0.111111 × 10^{−4},

so that

    fl(u) = 0.71425 × 10^0,  fl(v) = 0.98765 × 10^5,  and  fl(w) = 0.11111 × 10^{−4}.

Determine the five-digit chopping values of x ⊖ u, (x ⊖ u) ⊘ w, (x ⊖ u) ⊗ v, and u ⊕ v.
Solution   These numbers were chosen to illustrate some problems that can arise with finite-digit arithmetic. Because x and u are nearly the same, their difference is small. The absolute error for x ⊖ u is

    |(x − u) − (x ⊖ u)| = |(x − u) − fl(fl(x) − fl(u))|
                        = |(5/7 − 0.714251) − fl(0.71428 × 10^0 − 0.71425 × 10^0)|
                        = |0.347143 × 10^{−4} − fl(0.00003 × 10^0)| = 0.47143 × 10^{−5}.

This approximation has a small absolute error, but a large relative error

    (0.47143 × 10^{−5}) / (0.347143 × 10^{−4}) ≈ 0.136.

The subsequent division by the small number w or multiplication by the large number v magnifies the absolute error without modifying the relative error. The addition of the large and small numbers u and v produces large absolute error but not large relative error. These calculations are shown in Table 1.3.
Table 1.3

    Operation     Result              Actual value        Absolute error    Relative error
    x ⊖ u         0.30000 × 10^{−4}   0.34714 × 10^{−4}   0.471 × 10^{−5}   0.136
    (x ⊖ u) ⊘ w   0.27000 × 10^1      0.31242 × 10^1      0.424             0.136
    (x ⊖ u) ⊗ v   0.29629 × 10^1      0.34285 × 10^1      0.465             0.136
    u ⊕ v         0.98765 × 10^5      0.98766 × 10^5      0.161 × 10^1      0.163 × 10^{−4}
One of the most common error-producing calculations involves the cancelation of significant digits due to the subtraction of nearly equal numbers. Suppose two nearly equal numbers x and y, with x > y, have the k-digit representations

    fl(x) = 0.d_1 d_2 ... d_p α_{p+1} α_{p+2} ... α_k × 10^n,

and

    fl(y) = 0.d_1 d_2 ... d_p β_{p+1} β_{p+2} ... β_k × 10^n.

The floating-point form of x − y is

    fl(fl(x) − fl(y)) = 0.σ_{p+1} σ_{p+2} ... σ_k × 10^{n−p},

where

    0.σ_{p+1} σ_{p+2} ... σ_k = 0.α_{p+1} α_{p+2} ... α_k − 0.β_{p+1} β_{p+2} ... β_k.

The floating-point number used to represent x − y has at most k − p digits of significance. However, in most calculation devices, x − y will be assigned k digits, with the last p being either zero or randomly assigned. Any further calculations involving x − y retain the problem of having only k − p digits of significance, since a chain of calculations is no more accurate than its weakest portion.
If a finite-digit representation or calculation introduces an error, further enlargement of the error occurs when dividing by a number with small magnitude (or, equivalently, when multiplying by a number with large magnitude). Suppose, for example, that the number z has the finite-digit approximation z + δ, where the error δ is introduced by representation or by previous calculation. Now divide by ε = 10^{−n}, where n > 0. Then

    z/ε ≈ fl( fl(z) / fl(ε) ) = (z + δ) × 10^n.

The absolute error in this approximation, |δ| × 10^n, is the original absolute error, |δ|, multiplied by the factor 10^n.
Example 5   Let p = 0.54617 and q = 0.54601. Use four-digit arithmetic to approximate p − q and determine the absolute and relative errors using (a) rounding and (b) chopping.

Solution   The exact value of r = p − q is r = 0.00016.

    (a) Suppose the subtraction is performed using four-digit rounding arithmetic. Rounding p and q to four digits gives p* = 0.5462 and q* = 0.5460, respectively, and r* = p* − q* = 0.0002 is the four-digit approximation to r. Since

            |r − r*| / |r| = |0.00016 − 0.0002| / |0.00016| = 0.25,

        the result has only one significant digit, whereas p* and q* were accurate to four and five significant digits, respectively.

    (b) If chopping is used to obtain the four digits, the four-digit approximations to p, q, and r are p* = 0.5461, q* = 0.5460, and r* = p* − q* = 0.0001. This gives

            |r − r*| / |r| = |0.00016 − 0.0001| / |0.00016| = 0.375,

        which also results in only one significant digit of accuracy.
The loss of accuracy due to round-off error can often be avoided by a reformulation of
the calculations, as illustrated in the next example.
Illustration   The quadratic formula states that the roots of ax^2 + bx + c = 0, when a ≠ 0, are

    x_1 = (−b + √(b^2 − 4ac)) / (2a)   and   x_2 = (−b − √(b^2 − 4ac)) / (2a).    (1.1)

Consider this formula applied to the equation x^2 + 62.10x + 1 = 0, whose roots are approximately

    x_1 = −0.01610723   and   x_2 = −62.08390.

The roots x_1 and x_2 of a general quadratic equation are related to the coefficients by the fact that x_1 + x_2 = −b/a and x_1 x_2 = c/a. This is a special case of Vièta's Formulas for the coefficients of polynomials.
We will again use four-digit rounding arithmetic in the calculations to determine the roots. In this equation, b^2 is much larger than 4ac, so the numerator in the calculation for x_1 involves the subtraction of nearly equal numbers. Because

    √(b^2 − 4ac) = √((62.10)^2 − (4.000)(1.000)(1.000)) = √(3856. − 4.000) = √3852. = 62.06,

we have

    fl(x_1) = (−62.10 + 62.06) / 2.000 = −0.04000 / 2.000 = −0.02000,

a poor approximation to x_1 = −0.01611, with the large relative error

    |−0.01611 + 0.02000| / |−0.01611| ≈ 2.4 × 10^{−1}.
On the other hand, the calculation for x_2 involves the addition of the nearly equal numbers −b and −√(b^2 − 4ac). This presents no problem since

    fl(x_2) = (−62.10 − 62.06) / 2.000 = −124.2 / 2.000 = −62.10

has the small relative error

    |−62.08 + 62.10| / |−62.08| ≈ 3.2 × 10^{−4}.
To obtain a more accurate four-digit rounding approximation for x_1, we change the form of the quadratic formula by rationalizing the numerator:

    x_1 = ((−b + √(b^2 − 4ac)) / (2a)) · ((−b − √(b^2 − 4ac)) / (−b − √(b^2 − 4ac)))
        = (b^2 − (b^2 − 4ac)) / (2a(−b − √(b^2 − 4ac))),

which simplifies to an alternate quadratic formula

    x_1 = −2c / (b + √(b^2 − 4ac)).    (1.2)

Using (1.2) gives

    fl(x_1) = −2.000 / (62.10 + 62.06) = −2.000 / 124.2 = −0.01610,

which has the small relative error 6.2 × 10^{−4}.
The rationalization technique can also be applied to give the following alternative quadratic formula for x_2:

    x_2 = −2c / (b − √(b^2 − 4ac)).    (1.3)

This is the form to use if b is a negative number. In the Illustration, however, the mistaken use of this formula for x_2 would result in not only the subtraction of nearly equal numbers, but also the division by the small result of this subtraction. The inaccuracy that this combination produces,

    fl(x_2) = −2c / (b − √(b^2 − 4ac)) = −2.000 / (62.10 − 62.06) = −2.000 / 0.04000 = −50.00,

has the large relative error 1.9 × 10^{−1}.
The lesson: Think before you compute!
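The same cancelation appears in full double precision, only at a smaller scale. A common remedy, sketched below in Python, is to compute the large-magnitude root from formula (1.1) and recover the other from Vièta's relation x_1 x_2 = c/a (the function names are ours):

```python
import math

def roots_naive(a, b, c):
    """Both roots straight from the quadratic formula (1.1)."""
    d = math.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

def roots_stable(a, b, c):
    """Compute the large-magnitude root first, then the small one via x1*x2 = c/a."""
    d = math.sqrt(b * b - 4 * a * c)
    q = -0.5 * (b + math.copysign(d, b))   # never subtracts nearly equal numbers
    return c / q, q / a

print(roots_naive(1.0, 62.10, 1.0))
print(roots_stable(1.0, 62.10, 1.0))
```

The copysign call picks the sign that makes b and the square root add rather than cancel, which is exactly the choice between formulas (1.1), (1.2), and (1.3) made automatic.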
Nested Arithmetic
Accuracy loss due to round-off error can also be reduced by rearranging calculations, as
shown in the next example.
Example 6 Evaluate f(x) = x
3
6.1x
2
+ 3.2x +1.5 at x = 4.71 using three-digit arithmetic.
Solution Table 1.4 gives the intermediate results in the calculations.
Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Table 1.4

                            x       x^2       x^3          6.1x^2      3.2x
    Exact                   4.71    22.1841   104.487111   135.32301   15.072
    Three-digit (chopping)  4.71    22.1      104.         134.        15.0
    Three-digit (rounding)  4.71    22.2      105.         135.        15.1
To illustrate the calculations, let us look at those involved with finding $x^3$ using three-digit rounding arithmetic. First we find
$$x^2 = 4.71^2 = 22.1841, \quad \text{which rounds to } 22.2.$$
Then we use this value of $x^2$ to find
$$x^3 = x^2 \cdot x = 22.2 \cdot 4.71 = 104.562, \quad \text{which rounds to } 105.$$
Also,
$$6.1x^2 = 6.1(22.2) = 135.42, \quad \text{which rounds to } 135,$$
and
$$3.2x = 3.2(4.71) = 15.072, \quad \text{which rounds to } 15.1.$$
The exact result of the evaluation is
$$\text{Exact:} \quad f(4.71) = 104.487111 - 135.32301 + 15.072 + 1.5 = -14.263899.$$
Using finite-digit arithmetic, the way in which we add the results can affect the final result. Suppose that we add left to right. Then for chopping arithmetic we have
$$\text{Three-digit (chopping):} \quad f(4.71) = ((104. - 134.) + 15.0) + 1.5 = -13.5,$$
and for rounding arithmetic we have
$$\text{Three-digit (rounding):} \quad f(4.71) = ((105. - 135.) + 15.1) + 1.5 = -13.4.$$
(You should carefully verify these results to be sure that your notion of finite-digit arithmetic
is correct.) Note that the three-digit chopping values simply retain the leading three digits,
with no rounding involved, and differ significantly from the three-digit rounding values.
The relative errors for the three-digit methods are
$$\text{Chopping:} \quad \left|\frac{-14.263899 + 13.5}{-14.263899}\right| \approx 0.05, \quad \text{and} \quad \text{Rounding:} \quad \left|\frac{-14.263899 + 13.4}{-14.263899}\right| \approx 0.06.$$
Illustration  As an alternative approach, the polynomial $f(x)$ in Example 6 can be written in a nested manner as
$$f(x) = x^3 - 6.1x^2 + 3.2x + 1.5 = ((x - 6.1)x + 3.2)x + 1.5.$$
(Remember that chopping (or rounding) is performed after each calculation.) Using three-digit chopping arithmetic now produces
$$f(4.71) = ((4.71 - 6.1)4.71 + 3.2)4.71 + 1.5 = ((-1.39)(4.71) + 3.2)4.71 + 1.5$$
$$= (-6.54 + 3.2)4.71 + 1.5 = (-3.34)4.71 + 1.5 = -15.7 + 1.5 = -14.2.$$
In a similar manner, we now obtain a three-digit rounding answer of $-14.3$. The new relative errors are
$$\text{Three-digit (chopping):} \quad \left|\frac{-14.263899 + 14.2}{-14.263899}\right| \approx 0.0045;$$
$$\text{Three-digit (rounding):} \quad \left|\frac{-14.263899 + 14.3}{-14.263899}\right| \approx 0.0025.$$
Nesting has reduced the relative error for the chopping approximation to less than 10%
of that obtained initially. For the rounding approximation the improvement has been even
more dramatic; the error in this case has been reduced by more than 95%.
Polynomials should always be expressed in nested form before performing an evalu-
ation, because this form minimizes the number of arithmetic calculations. The decreased
error in the Illustration is due to the reduction in computations from four multiplications
and three additions to two multiplications and three additions. One way to reduce round-off
error is to reduce the number of computations.
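The effect of nesting can be reproduced in a short Python sketch (ours, not from the text); the `chop` helper is an assumption that simulates three-digit chopping arithmetic, with a small rounding guard against binary representation error.

```python
import math

def chop(y, digits=3):
    """Truncate y to the given number of significant decimal digits."""
    if y == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(y)))
    factor = 10.0 ** (digits - 1 - exponent)
    # round(..., 9) guards against binary representation error before truncating
    return math.trunc(round(y * factor, 9)) / factor

def f_standard(x):
    # f(x) = x^3 - 6.1x^2 + 3.2x + 1.5, chopping after every operation,
    # summed left to right as in Example 6
    x2 = chop(x * x)
    x3 = chop(x2 * x)
    return chop(chop(chop(x3 - chop(6.1 * x2)) + chop(3.2 * x)) + 1.5)

def f_nested(x):
    # Nested (Horner) form: ((x - 6.1)x + 3.2)x + 1.5
    t = chop(x - 6.1)
    t = chop(chop(t * x) + 3.2)
    return chop(chop(t * x) + 1.5)

print(f_standard(4.71))  # -13.5
print(f_nested(4.71))    # -14.2
```

Against the true value $-14.263899$, the nested form gives $-14.2$ while term-by-term evaluation gives $-13.5$, matching the Illustration.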
EXERCISE SET 1.2
1. Compute the absolute error and relative error in approximations of $p$ by $p^*$.
   a. $p = \pi$, $p^* = 22/7$        b. $p = \pi$, $p^* = 3.1416$
   c. $p = e$, $p^* = 2.718$        d. $p = \sqrt{2}$, $p^* = 1.414$
   e. $p = e^{10}$, $p^* = 22000$        f. $p = 10^{\pi}$, $p^* = 1400$
   g. $p = 8!$, $p^* = 39900$        h. $p = 9!$, $p^* = \sqrt{18\pi}\,(9/e)^9$
2. Find the largest interval in which $p^*$ must lie to approximate $p$ with relative error at most $10^{-4}$ for each value of $p$.
   a. $\pi$        b. $e$        c. $\sqrt{2}$        d. $\sqrt[3]{7}$
3. Suppose $p^*$ must approximate $p$ with relative error at most $10^{-3}$. Find the largest interval in which $p^*$ must lie for each value of $p$.
   a. 150        b. 900        c. 1500        d. 90
4. Perform the following computations (i) exactly, (ii) using three-digit chopping arithmetic, and (iii) using three-digit rounding arithmetic. (iv) Compute the relative errors in parts (ii) and (iii).
   a. $\frac{4}{5} + \frac{1}{3}$        b. $\frac{4}{5} \cdot \frac{1}{3}$
   c. $\left(\frac{1}{3} - \frac{3}{11}\right) + \frac{3}{20}$        d. $\left(\frac{1}{3} + \frac{3}{11}\right) - \frac{3}{20}$
5. Use three-digit rounding arithmetic to perform the following calculations. Compute the absolute error and relative error with the exact value determined to at least five digits.
   a. $133 + 0.921$        b. $133 - 0.499$
   c. $(121 - 0.327) - 119$        d. $(121 - 119) - 0.327$
   e. $\dfrac{\frac{13}{14} - \frac{6}{7}}{2e - 5.4}$        f. $-10\pi + 6e - \frac{3}{62}$
   g. $\left(\frac{2}{9}\right) \cdot \left(\frac{9}{7}\right)$        h. $\dfrac{\pi - \frac{22}{7}}{\frac{1}{17}}$
6. Repeat Exercise 5 using four-digit rounding arithmetic.
7. Repeat Exercise 5 using three-digit chopping arithmetic.
8. Repeat Exercise 5 using four-digit chopping arithmetic.
9. The first three nonzero terms of the Maclaurin series for the arctangent function are $x - (1/3)x^3 + (1/5)x^5$. Compute the absolute error and relative error in the following approximations of $\pi$ using the polynomial in place of the arctangent:
   a. $4\left[\arctan\frac{1}{2} + \arctan\frac{1}{3}\right]$
   b. $16\arctan\frac{1}{5} - 4\arctan\frac{1}{239}$
10. The number $e$ can be defined by $e = \sum_{n=0}^{\infty} (1/n!)$, where $n! = n(n-1)\cdots 2 \cdot 1$ for $n \neq 0$ and $0! = 1$. Compute the absolute error and relative error in the following approximations of $e$:
    a. $\sum_{n=0}^{5} \frac{1}{n!}$        b. $\sum_{n=0}^{10} \frac{1}{n!}$
11. Let
    $$f(x) = \frac{x\cos x - \sin x}{x - \sin x}.$$
    a. Find $\lim_{x \to 0} f(x)$.
    b. Use four-digit rounding arithmetic to evaluate $f(0.1)$.
    c. Replace each trigonometric function with its third Maclaurin polynomial, and repeat part (b).
    d. The actual value is $f(0.1) = -1.99899998$. Find the relative error for the values obtained in parts (b) and (c).
12. Let
    $$f(x) = \frac{e^x - e^{-x}}{x}.$$
    a. Find $\lim_{x \to 0} (e^x - e^{-x})/x$.
    b. Use three-digit rounding arithmetic to evaluate $f(0.1)$.
    c. Replace each exponential function with its third Maclaurin polynomial, and repeat part (b).
    d. The actual value is $f(0.1) = 2.003335000$. Find the relative error for the values obtained in parts (b) and (c).
13. Use four-digit rounding arithmetic and the formulas (1.1), (1.2), and (1.3) to find the most accurate approximations to the roots of the following quadratic equations. Compute the absolute errors and relative errors.
    a. $\frac{1}{3}x^2 - \frac{123}{4}x + \frac{1}{6} = 0$
    b. $\frac{1}{3}x^2 + \frac{123}{4}x - \frac{1}{6} = 0$
    c. $1.002x^2 - 11.01x + 0.01265 = 0$
    d. $1.002x^2 + 11.01x + 0.01265 = 0$
14. Repeat Exercise 13 using four-digit chopping arithmetic.
15. Use the 64-bit long real format to find the decimal equivalent of the following floating-point machine
numbers.
a. 0 10000001010 1001001100000000000000000000000000000000000000000000
b. 1 10000001010 1001001100000000000000000000000000000000000000000000
c. 0 01111111111 0101001100000000000000000000000000000000000000000000
d. 0 01111111111 0101001100000000000000000000000000000000000000000001
16. Find the next largest and smallest machine numbers in decimal form for the numbers given in Exer-
cise 15.
17. Suppose two points $(x_0, y_0)$ and $(x_1, y_1)$ are on a straight line with $y_1 \neq y_0$. Two formulas are available to find the $x$-intercept of the line:
    $$x = \frac{x_0 y_1 - x_1 y_0}{y_1 - y_0} \quad \text{and} \quad x = x_0 - \frac{(x_1 - x_0)y_0}{y_1 - y_0}.$$
a. Show that both formulas are algebraically correct.
    b. Use the data $(x_0, y_0) = (1.31, 3.24)$ and $(x_1, y_1) = (1.93, 4.76)$ and three-digit rounding arithmetic to compute the $x$-intercept both ways. Which method is better and why?
18. The Taylor polynomial of degree $n$ for $f(x) = e^x$ is $\sum_{i=0}^{n} (x^i/i!)$. Use the Taylor polynomial of degree nine and three-digit chopping arithmetic to find an approximation to $e^{-5}$ by each of the following methods.
    a. $e^{-5} \approx \sum_{i=0}^{9} \frac{(-5)^i}{i!} = \sum_{i=0}^{9} \frac{(-1)^i 5^i}{i!}$
    b. $e^{-5} = \frac{1}{e^5} \approx \frac{1}{\sum_{i=0}^{9} \frac{5^i}{i!}}$
    c. An approximate value of $e^{-5}$ correct to three digits is $6.74 \times 10^{-3}$. Which formula, (a) or (b), gives the most accuracy, and why?
19. The two-by-two linear system
    $$ax + by = e,$$
    $$cx + dy = f,$$
    where $a, b, c, d, e, f$ are given, can be solved for $x$ and $y$ as follows:
        set $m = \frac{c}{a}$, provided $a \neq 0$;
        $d_1 = d - mb$;
        $f_1 = f - me$;
        $y = \frac{f_1}{d_1}$;
        $x = \frac{(e - by)}{a}$.
    Solve the following linear systems using four-digit rounding arithmetic.
    a. $1.130x - 6.990y = 14.20$
       $1.013x - 6.099y = 14.22$
    b. $8.110x + 12.20y = -0.1370$
       $-18.11x + 112.2y = -0.1376$
20. Repeat Exercise 19 using four-digit chopping arithmetic.
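The elimination steps listed in Exercise 19 translate directly into code. The following Python sketch (ours, using exact double precision rather than four-digit arithmetic, and an illustrative system rather than the exercise data) shows the procedure:

```python
def solve2x2(a, b, c, d, e, f):
    """Solve ax + by = e, cx + dy = f by the elimination steps of Exercise 19."""
    if a == 0:
        raise ValueError("the procedure requires a != 0")
    m = c / a          # multiplier for eliminating x from the second equation
    d1 = d - m * b
    f1 = f - m * e
    y = f1 / d1        # back-substitute
    x = (e - b * y) / a
    return x, y

# Sample system (not from the exercise): 2x + y = 5, x + 3y = 10
x, y = solve2x2(2.0, 1.0, 1.0, 3.0, 5.0, 10.0)
print(x, y)  # 1.0 3.0
```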
21. a. Show that the polynomial nesting technique described in Example 6 can also be applied to the evaluation of
       $$f(x) = 1.01e^{4x} - 4.62e^{3x} - 3.11e^{2x} + 12.2e^x - 1.99.$$
    b. Use three-digit rounding arithmetic, the assumption that $e^{1.53} = 4.62$, and the fact that $e^{nx} = (e^x)^n$ to evaluate $f(1.53)$ as given in part (a).
    c. Redo the calculation in part (b) by first nesting the calculations.
    d. Compare the approximations in parts (b) and (c) to the true three-digit result $f(1.53) = -7.61$.
22. A rectangular parallelepiped has sides of length 3 cm, 4 cm, and 5 cm, measured to the nearest
centimeter. What are the best upper and lower bounds for the volume of this parallelepiped? What
are the best upper and lower bounds for the surface area?
23. Let $P_n(x)$ be the Maclaurin polynomial of degree $n$ for the arctangent function. Use Maple carrying 75 decimal digits to find the value of $n$ required to approximate $\pi$ to within $10^{-25}$ using the following formulas.
    a. $4\left[P_n\left(\frac{1}{2}\right) + P_n\left(\frac{1}{3}\right)\right]$
    b. $16P_n\left(\frac{1}{5}\right) - 4P_n\left(\frac{1}{239}\right)$
24. Suppose that $fl(y)$ is a $k$-digit rounding approximation to $y$. Show that
    $$\left|\frac{y - fl(y)}{y}\right| \leq 0.5 \times 10^{-k+1}.$$
    [Hint: If $d_{k+1} < 5$, then $fl(y) = 0.d_1 d_2 \ldots d_k \times 10^n$. If $d_{k+1} \geq 5$, then $fl(y) = 0.d_1 d_2 \ldots d_k \times 10^n + 10^{n-k}$.]
25. The binomial coefficient
    $$\binom{m}{k} = \frac{m!}{k!\,(m-k)!}$$
    describes the number of ways of choosing a subset of $k$ objects from a set of $m$ elements.
    a. Suppose decimal machine numbers are of the form
       $$\pm 0.d_1 d_2 d_3 d_4 \times 10^n, \quad \text{with } 1 \leq d_1 \leq 9,\ 0 \leq d_i \leq 9 \text{ for } i = 2, 3, 4, \text{ and } |n| \leq 15.$$
       What is the largest value of $m$ for which the binomial coefficient $\binom{m}{k}$ can be computed for all $k$ by the definition without causing overflow?
    b. Show that $\binom{m}{k}$ can also be computed by
       $$\binom{m}{k} = \left(\frac{m}{k}\right)\left(\frac{m-1}{k-1}\right)\cdots\left(\frac{m-k+1}{1}\right).$$
    c. What is the largest value of $m$ for which the binomial coefficient $\binom{m}{3}$ can be computed by the formula in part (b) without causing overflow?
    d. Use the equation in (b) and four-digit chopping arithmetic to compute the number of possible 5-card hands in a 52-card deck. Compute the actual and relative errors.
26. Let $f \in C[a, b]$ be a function whose derivative exists on $(a, b)$. Suppose $f$ is to be evaluated at $x_0$ in $(a, b)$, but instead of computing the actual value $f(x_0)$, the approximate value, $\tilde{f}(x_0)$, is the actual value of $f$ at $x_0 + \epsilon$, that is, $\tilde{f}(x_0) = f(x_0 + \epsilon)$.
    a. Use the Mean Value Theorem 1.8 to estimate the absolute error $|f(x_0) - \tilde{f}(x_0)|$ and the relative error $|f(x_0) - \tilde{f}(x_0)|/|f(x_0)|$, assuming $f(x_0) \neq 0$.
    b. If $\epsilon = 5 \times 10^{-6}$ and $x_0 = 1$, find bounds for the absolute and relative errors for
       i. $f(x) = e^x$        ii. $f(x) = \sin x$
    c. Repeat part (b) with $\epsilon = (5 \times 10^{-6})x_0$ and $x_0 = 10$.
27. The following Maple procedure chops a floating-point number $x$ to $t$ digits. (Use the Shift and Enter keys at the end of each line when creating the procedure.)

    chop := proc(x, t)
      local e, x2;
      if x = 0 then 0
      else
        e := ceil(evalf(log10(abs(x))));
        x2 := evalf(trunc(x * 10^(t - e)) * 10^(e - t));
      end if
    end;

    Verify the procedure works for the following values.
    a. $x = 124.031$, $t = 5$        b. $x = 124.036$, $t = 5$
    c. $x = -124.031$, $t = 5$        d. $x = -124.036$, $t = 5$
    e. $x = 0.00653$, $t = 2$        f. $x = 0.00656$, $t = 2$
    g. $x = -0.00653$, $t = 2$        h. $x = -0.00656$, $t = 2$
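For readers working without Maple, a rough Python analogue of the procedure (our own port, following the same exponent and truncation steps, with a small rounding guard against binary representation error) is:

```python
import math

def chop(x, t):
    """Chop the floating-point number x to t significant digits."""
    if x == 0:
        return 0
    e = math.ceil(math.log10(abs(x)))
    # round(..., 9) guards against binary representation error before truncating
    return math.trunc(round(x * 10.0 ** (t - e), 9)) * 10.0 ** (e - t)

print(chop(124.036, 5))   # chops to 124.03
print(chop(-0.00656, 2))  # chops to -0.0065
```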
28. The opening example to this chapter described a physical experiment involving the temperature of a gas under pressure. In this application, we were given $P = 1.00$ atm, $V = 0.100$ m$^3$, $N = 0.00420$ mol, and $R = 0.08206$. Solving for $T$ in the ideal gas law gives
    $$T = \frac{PV}{NR} = \frac{(1.00)(0.100)}{(0.00420)(0.08206)} = 290.15 \text{ K} = 17^{\circ}\text{C}.$$
    In the laboratory, it was found that $T$ was $15^{\circ}$C under these conditions, and when the pressure was doubled and the volume halved, $T$ was $19^{\circ}$C. Assume that the data are rounded values accurate to the places given, and show that both laboratory figures are within the bounds of accuracy for the ideal gas law.
1.3 Algorithms and Convergence
Throughout the text we will be examining approximation procedures, called algorithms,
involving sequences of calculations. An algorithm is a procedure that describes, in an
unambiguous manner, a finite sequence of steps to be performed in a specified order. The
object of the algorithm is to implement a procedure to solve a problem or approximate a
solution to the problem.
The use of an algorithm is as old as formal mathematics, but the name derives from the Arabic mathematician Muhammad ibn-Mûsâ al-Khwârizmî (c. 780–850). The Latin translation of his works begins with the words "Dixit Algorismi," meaning "al-Khwârizmî says."
We use a pseudocode to describe the algorithms. This pseudocode specifies the form
of the input to be supplied and the form of the desired output. Not all numerical procedures
give satisfactory output for arbitrarily chosen input. As a consequence, a stopping technique
independent of the numerical technique is incorporated into each algorithm to avoid infinite
loops.
Two punctuation symbols are used in the algorithms:
a period (.) indicates the termination of a step,
a semicolon (;) separates tasks within a step.
Indentation is used to indicate that groups of statements are to be treated as a single entity.
Looping techniques in the algorithms are either counter-controlled, such as

    For $i = 1, 2, \ldots, n$
        Set $x_i = a + i \cdot h$

or condition-controlled, such as

    While $i < N$ do Steps 3–6.
To allow for conditional execution, we use the standard
    If … then    or    If … then … else
constructions.
The steps in the algorithms follow the rules of structured program construction. They
have been arranged so that there should be minimal difficulty translating pseudocode into
any programming language suitable for scientific applications.
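For instance, the counter-controlled loop above could be rendered in Python as follows (the sample values of a, h, and n are our own, chosen for illustration):

```python
# Translating the pseudocode
#   For i = 1, 2, ..., n
#       Set x_i = a + i*h
# into Python.
a, h, n = 0.0, 0.25, 4

x = []
for i in range(1, n + 1):  # i = 1, 2, ..., n
    x.append(a + i * h)

print(x)  # [0.25, 0.5, 0.75, 1.0]
```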
The algorithms are liberally laced with comments. These are written in italics and
contained within parentheses to distinguish them from the algorithmic statements.